Refine
Document Type
- Article (1)
- Doctoral Thesis (1)
Keywords
Institute
- Fachbereich 4 (1)
- Informatik (1)
Similarity-based retrieval of semantic graphs is a core task of Process-Oriented Case-Based Reasoning (POCBR) with applications in real-world scenarios, e.g., in smart manufacturing. The involved similarity computation is usually complex and time-consuming, as it requires some kind of inexact graph matching. To tackle these problems, we present an approach to modeling similarity measures based on embedding semantic graphs via Graph Neural Networks (GNNs). Therefore, we first examine how arbitrary semantic graphs, including node and edge types and their knowledge-rich semantic annotations, can be encoded in a numeric format that is usable by GNNs. Given this, the architecture of two generic graph embedding models from the literature is adapted to enable their usage as a similarity measure for similarity-based retrieval. Thereby, one of the two models is more optimized towards fast similarity prediction, while the other model is optimized towards knowledge-intensive, more expressive predictions. The evaluation examines the quality and performance of these models in preselecting retrieval candidates and in approximating the ground-truth similarities of a graph-matching-based similarity measure for two semantic graph domains. The results show the great potential of the approach for use in a retrieval scenario, either as a preselection model or as an approximation of a graph similarity measure.
Case-Based Reasoning (CBR) is a symbolic Artificial Intelligence (AI) approach that has been successfully applied across various domains, including medical diagnosis, product configuration, and customer support, to solve problems based on experiential knowledge and analogy. A key aspect of CBR is its problem-solving procedure, where new solutions are created by referencing similar experiences, which makes CBR explainable and effective even with small amounts of data. However, one of the most significant challenges in CBR lies in defining and computing meaningful similarities between new and past problems, which heavily relies on domain-specific knowledge. This knowledge, typically only available through human experts, must be manually acquired, leading to what is commonly known as the knowledge-acquisition bottleneck.
One way to mitigate the knowledge-acquisition bottleneck is through a hybrid approach that combines the symbolic reasoning strengths of CBR with the learning capabilities of Deep Learning (DL), a sub-symbolic AI method. DL, which utilizes deep neural networks, has gained immense popularity due to its ability to automatically learn from raw data to solve complex AI problems such as object detection, question answering, and machine translation. While DL minimizes manual knowledge acquisition by automatically training models from data, it comes with its own limitations, such as requiring large datasets, and being difficult to explain, often functioning as a "black box". By bringing together the symbolic nature of CBR and the data-driven learning abilities of DL, a neuro-symbolic, hybrid AI approach can potentially overcome the limitations of both methods, resulting in systems that are both explainable and capable of learning from data.
The focus of this thesis is on integrating DL into the core task of similarity assessment within CBR, specifically in the domain of process management. Processes are fundamental to numerous industries and sectors, with process management techniques, particularly Business Process Management (BPM), being widely applied to optimize organizational workflows. Process-Oriented Case-Based Reasoning (POCBR) extends traditional CBR to handle procedural data, enabling applications such as adaptive manufacturing, where past processes are analyzed to find alternative solutions when problems arise. However, applying CBR to process management introduces additional complexity, as procedural cases are typically represented as semantically annotated graphs, increasing the knowledge-acquisition effort for both case modeling and similarity assessment.
The key contributions of this thesis are as follows: It presents a method for preparing procedural cases, represented as semantic graphs, to be used as input for neural networks. Handling such complex, structured data represents a significant challenge, particularly given the scarcity of available process data in most organizations. To overcome the issue of data scarcity, the thesis proposes data augmentation techniques to artificially expand the process datasets, enabling more effective training of DL models. Moreover, it explores several deep learning architectures and training setups for learning similarity measures between procedural cases in POCBR applications. This includes the use of experience-based Hyperparameter Optimization (HPO) methods to fine-tune the deep learning models.
Additionally, the thesis addresses the computational challenges posed by graph-based similarity assessments in CBR. The traditional method of determining similarity through subgraph isomorphism checks, which compare nodes and edges across graphs, is computationally expensive. To alleviate this issue, the hybrid approach seeks to use DL models to approximate these similarity calculations more efficiently, thus reducing the computational complexity involved in graph matching.
The experimental evaluations of the corresponding contributions provide consistent results that indicate the benefits of using DL-based similarity measures and case retrieval methods in POCBR applications. The comparison with existing methods, e.g., based on subgraph isomorphism, shows several advantages but also some disadvantages of the compared methods. In summary, the methods and contributions outlined in this work enable more efficient and robust applications of hybrid CBR and DL in process management applications.