Representation Learning techniques play a crucial role in a wide variety of Deep Learning applications. From Language Generation to Link Prediction on Graphs, learned numerical vector representations often build the foundation for numerous downstream tasks.
In Natural Language Processing, word embeddings are contextualized and depend on their current context. This useful property reflects how words can have different meanings based on their neighboring words.
In Knowledge Graph Embedding (KGE) approaches, static vector representations are still the dominant approach. While this is sufficient for applications where the underlying Knowledge Graph (KG) mainly stores static information, it becomes a disadvantage when dynamic entity behavior needs to be modelled.
To address this issue, KGE approaches would need to model dynamic entities by incorporating situational and sequential context into the vector representations of entities. Analogous to contextualised word embeddings, this would allow entity embeddings to change depending on their history and current situational factors.
Therefore, this thesis provides a description of how to transform static KGE approaches to contextualised dynamic approaches and how the specific characteristics of different dynamic scenarios are need to be taken into consideration.
As a starting point, we conduct empirical studies that attempt to integrate sequential and situational context into static KG embeddings and investigate the limitations of the different approaches. In a second step, the identified limitations serve as guidance for developing a framework that enables KG embeddings to become truly dynamic, taking into account both the current situation and the past interactions of an entity. The two main contributions in this step are the introduction of the temporally contextualized Knowledge Graph formalism and the corresponding RETRA framework which realizes the contextualisation of entity embeddings.
Finally, we demonstrate how situational contextualisation can be realized even in static environments, where all object entities are passive at all times.
For this, we introduce a novel task that requires the combination of multiple context modalities and their integration with a KG based view on entity behavior.
While humans find it easy to process visual information from the real world, machines struggle with this task due to the unstructured and complex nature of the information. Computer vision (CV) is the approach of artificial intelligence that attempts to automatically analyze, interpret, and extract such information. Recent CV approaches mainly use deep learning (DL) due to its very high accuracy. DL extracts useful features from unstructured images in a training dataset to use them for specific real-world tasks. However, DL requires a large number of parameters, computational power, and meaningful training data, which can be noisy, sparse, and incomplete for specific domains. Furthermore, DL tends to learn correlations from the training data that do not occur in reality, making DNNs poorly generalizable and error-prone.
Therefore, the field of visual transfer learning is seeking methods that are less dependent on training data and are thus more applicable in the constantly changing world. One idea is to enrich DL with prior knowledge. Knowledge graphs (KG) serve as a powerful tool for this purpose because they can formalize and organize prior knowledge based on an underlying ontological schema. They contain symbolic operations such as logic, rules, and reasoning, and can be created, adapted, and interpreted by domain experts. Due to the abstraction potential of symbols, KGs provide good prerequisites for generalizing their knowledge. To take advantage of the generalization properties of KG and the ability of DL to learn from large-scale unstructured data, attempts have long been made to combine explicit graph and implicit vector representations. However, with the recent development of knowledge graph embedding methods, where a graph is transferred into a vector space, new perspectives for a combination in vector space are opening up.
In this work, we attempt to combine prior knowledge from a KG with DL to improve visual transfer learning using the following steps: First, we explore the potential benefits of using prior knowledge encoded in a KG for DL-based visual transfer learning. Second, we investigate approaches that already combine KG and DL and create a categorization based on their general idea of knowledge integration. Third, we propose a novel method for the specific category of using the knowledge graph as a trainer, where a DNN is trained to adapt to a representation given by prior knowledge of a KG. Fourth, we extend the proposed method by extracting relevant context in the form of a subgraph of the KG to investigate the relationship between prior knowledge and performance on a specific CV task. In summary, this work provides deep insights into the combination of KG and DL, with the goal of making DL approaches more generalizable, more efficient, and more interpretable through prior knowledge.