### Refine

#### Year of publication

- 2011 (2) (remove)

#### Language

- English (2) (remove)

#### Has Fulltext

- yes (2) (remove)

#### Keywords

- Algorithmische Lerntheorie (1)
- Automata Theory (1)
- Automatentheorie (1)
- Directed Graphs (1)
- Formal languages (1)
- Gerichteter Graph (1)
- Graph Minors (1)
- Graph Rewriting (1)
- Mathematische Lerntheorie (1)
- Minor <Graphentheorie> (1)
- Reduktionssystem (1)
- Regular Expressions (1)
- RegulĂ¤rer Ausdruck (1)
- Theoretische Informatik (1)
- grammatical inference (1)

#### Institute

- Informatik (2) (remove)

This thesis centers on formal tree languages and on their learnability by algorithmic methods in abstractions of several learning settings. After a general introduction, we present a survey of relevant definitions for the formal tree concept as well as special cases (strings) and refinements (multi-dimensional trees) thereof. In Chapter 3 we discuss the theoretical foundations of algorithmic learning in a specific type of setting of particular interest in the area of Grammatical Inference where the task consists in deriving a correct formal description for an unknown target language from various information sources (queries and/or finite samples) in a polynomial number of steps. We develop a parameterized meta-algorithm that incorporates several prominent learning algorithms from the literature in order to highlight the basic routines which regardless of the nature of the information sources have to be run through by all those algorithms alike. In this framework, the intended target descriptions are deterministic finite-state tree automata. We discuss the limited transferability of this approach to another class of descriptions, residual finite-state tree automata, for which we propose several learning algorithms as well. The learnable class by these techniques corresponds to the class of regular tree languages. In Chapter 4we outline a recent range of attempts in Grammatical Inference to extend the learnable language classes beyond regularity and even beyond context-freeness by techniques based on syntactic observations which can be subsumed under the term 'distributional learning', and we describe learning algorithms in several settings for the tree case taking this approach. We conclude with some general reflections on the notion of learning from structural information.

This work is concerned with two kinds of objects: regular expressions and finite automata. These formalisms describe regular languages, i.e., sets of strings that share a comparatively simple structure. Such languages--- and, in turn, expressions and automata --- are used in the description of textual patterns, workflow and dependence modeling, or formal verification. Testing words for membership in any given such language can be implemented using a fixed --- i.e., finite --- amount of memory, which is conveyed by the phrasing finite''automaton. In this aspect they differ from more general classes, which require potentially unbound memory, but have the potential to model less regular, i.e., more involved, objects. Other than expressions and automata, there are several further formalisms to describe regular languages. These formalisms are all equivalent and conversions among them are well-known.However, expressions and automata are arguably the notions which are used most frequently: regular expressions come natural to humans in order to express patterns, while finite automata translate immediately to efficient data structures. This raises the interest in methods to translate among the two notions efficiently. In particular,the direction from expressions to automata, or from human input to machine representation, is of great practical relevance. Probably the most frequent application that involves regular expressions and finite automata is pattern matching in static text and streaming data. Common tools to locate instances of a pattern in a text are the grep application or its (many) derivatives, as well as awk, sed and lex. Notice that these programs accept slightly more general patterns, namely ''POSIX expressions''. Concerning streaming data, regular expressions are nowadays used to specify filter rules in routing hardware.These applications have in common that an input pattern is specified in form a regular expression while the execution applies a regular automaton. As it turns out, the effort that is necessary to describe a regular language, i.e., the size of the descriptor,varies with the chosen representation. For example, in the case of regular expressions and finite automata, it is rather easy to see that any regular expression can be converted to a finite automaton whose size is linear in that of the expression. For the converse direction, however, it is known that there are regular languages for which the size of the smallest describing expression is exponential in the size of the smallest describing automaton.This brings us to the subject at the core of the present work: we investigate conversions between expressions and automata and take a closer look at the properties that exert an influence on the relative sizes of these objects.We refer to the aspects involved with these consideration under the titular term of Relative Descriptional Complexity.