Statistical matching offers a way to broaden the scope of analysis without the increased respondent burden and costs that would result from conducting a new survey or adding variables to an existing one. Statistical matching aims to combine two datasets A and B referring to the same target population in order to jointly analyse variables, say Y and Z, that were initially not observed together. The matching is performed based on matching variables X that are common to both datasets A and B, whereas Y is only observed in B and Z is only observed in A. To overcome the fact that no joint information on X, Y and Z is available, statistical matching procedures have to rely on suitable assumptions. To provide a theoretical foundation for statistical matching, most procedures rely on the conditional independence assumption (CIA), i.e. given X, Y is independent of Z.
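The setup can be illustrated with a small simulation sketch (all data, variable roles and parameter values here are synthetic, not from the thesis): under the CIA, a classical distance hot-deck match borrows Y values from B records that are close in X.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic illustration: X drives both Y and Z, so the CIA
# (Y independent of Z given X) holds by construction.
n = 1000
X = rng.normal(size=n)
Y = 2.0 * X + rng.normal(scale=0.5, size=n)
Z = -1.0 * X + rng.normal(scale=0.5, size=n)

# Split into the two observed files: A observes (X, Z), B observes (X, Y).
XA, ZA = X[:500], Z[:500]
XB, YB = X[500:], Y[500:]

# Classical approach via nearest-neighbour (distance hot deck) matching:
# each record in A borrows Y from the B record closest in X.
nearest = np.abs(XA[:, None] - XB[None, :]).argmin(axis=1)
Y_imputed = YB[nearest]

# The matched file (X, Y, Z) can now be analysed jointly.
matched = np.column_stack([XA, Y_imputed, ZA])
print(matched.shape)  # (500, 3)
```

The matched file then supports the joint analysis of Y and Z that neither source file allows on its own; its validity hinges entirely on the CIA, which motivates the assumptions-centred view taken in this thesis.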
The goal of this thesis is to encompass both the statistical matching process and the analysis of the matched dataset. More specifically, the aim is to estimate a linear regression model for Z given Y and possibly other covariates in data A. Since the validity of the assumptions underlying the matching process determines the validity of the obtained matched file, the accuracy of statistical inference is determined by the suitability of these assumptions. By putting the focus on these assumptions, this work proposes a systematic categorisation of approaches to statistical matching based on graphical representations in the form of directed acyclic graphs. These graphs are particularly useful for representing dependencies and independencies, which are at the heart of the statistical matching problem. The proposed categorisation distinguishes between (a) joint modelling of the matching and the analysis (integrated approach), and (b) matching subsequently followed by statistical analysis of the matched dataset (classical approach). Whereas the classical approach relies on the CIA, implementations of the integrated approach are only valid if they converge, i.e. if the specified models are identifiable and, in the case of MCMC implementations, if the algorithm converges to a proper distribution.
In this thesis, an implementation of the integrated approach is proposed in which the imputation step and the estimation step are jointly modelled through a fully Bayesian MCMC estimation. It is based on a linear regression model for Z given Y and accounts for both a linear regression model and a random effects model for Y. Its validity rests on the instrumental variable assumption (IVA). The IVA corresponds to: (a) Z is independent of a subset X’ of X given Y and X*, where X* = X\X’, and (b) Y is correlated with X’ given X*. A proof that the joint Bayesian modelling of both the model for Z and the model for Y through an MCMC simulation converges to a proper distribution is provided in this thesis. In a first model-based simulation study, the proposed integrated Bayesian procedure is assessed with regard to the data situation, convergence issues, and the underlying assumptions. Special interest lies in investigating the interplay of the Y and the Z model within the imputation process. It turns out that failure scenarios can be distinguished by comparing the CIA and the IVA in the completely observed dataset.
Finally, both approaches to statistical matching, i.e. the classical approach and the integrated approach, are subject to an extensive comparison in (1) a model-based simulation study and (2) a simulation study based on the AMELIA dataset, which is an openly available, very large synthetic dataset that is, by construction, similar to the EU-SILC survey. As an additional integrated approach, a Bayesian additive regression trees (BART) model is considered for modelling Y. These integrated procedures are compared to the classical approach, represented by predictive mean matching in the form of multiple imputation by chained equations. Suitably chosen, the first simulation framework offers the possibility to clarify aspects related to the underlying assumptions by comparing the IVA and the CIA and by evaluating the impact of the matching variables. Thus, within this simulation study two related aspects are of special interest: the assumptions underlying each method and the incorporation of additional matching variables. The simulation on the AMELIA dataset offers a close-to-reality framework with the advantage of knowing the whole setting, i.e. the whole data X, Y and Z. Special interest lies in investigating assumptions through adding and excluding auxiliary variables in order to enhance conditional independence and assess the sensitivity of the methods to this issue. Furthermore, the benefit of having an overlap of units in data A and B for which information on X, Y, Z is available is investigated. It turns out that the integrated approach yields better results than the classical approach when the CIA clearly does not hold. Moreover, even when the classical approach obtains unbiased results for the regression coefficient of Y in the model for Z, it is the method relying on BART that performs best across all coefficients.
In conclusion, this work constitutes a major contribution to the clarification of the assumptions essential to any statistical matching procedure. By introducing graphical models to characterise existing approaches to statistical matching combined with the subsequent analysis of the matched dataset, it offers an extensive overview, categorisation and extension of theory and application. Furthermore, in a setting where none of the assumptions are testable (since X, Y and Z are never observed together), the integrated approach is a valuable asset in offering an alternative to the CIA.
This thesis is concerned with two classes of optimization problems which stem
mainly from statistics: clustering problems and cardinality-constrained optimization problems. We are particularly interested in the development of computational techniques to exactly or heuristically solve instances of these two classes
of optimization problems.
The minimum sum-of-squares clustering (MSSC) problem is widely used
to find clusters within a set of data points. The problem is also known as
the $k$-means problem, since the most prominent heuristic to compute a feasible
point of this optimization problem is the $k$-means method. In many modern
applications, however, the clustering suffers from uncertain input data due to,
e.g., unstructured measurement errors. In this case, the clustering
result represents a clustering of the erroneous measurements instead of
recovering the true underlying clustering structure. We address this issue by
applying robust optimization techniques: we derive the strictly and $\Gamma$-robust
counterparts of the MSSC problem, which are as challenging to solve as the
original model. Moreover, we develop alternating direction methods to quickly
compute feasible points of good quality. Our experiments reveal that the more
conservative strictly robust model consistently provides better clustering solutions
than the nominal and the less conservative $\Gamma$-robust models.
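For context, the nominal $k$-means heuristic mentioned above can be sketched in a few lines (plain NumPy, with illustrative data and parameter choices, not code from the thesis):

```python
import numpy as np

def k_means(points, k, iters=50, seed=0):
    """Plain Lloyd heuristic for the MSSC problem: alternate between
    assigning each point to its nearest centre and recomputing each
    centre as the mean of its assigned points."""
    rng = np.random.default_rng(seed)
    centres = points[rng.choice(len(points), size=k, replace=False)].copy()
    labels = np.zeros(len(points), dtype=int)
    for _ in range(iters):
        # Assignment step: squared distances of every point to every centre.
        d = ((points[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        # Update step: keep an empty cluster's centre unchanged.
        for j in range(k):
            if (labels == j).any():
                centres[j] = points[labels == j].mean(axis=0)
    return centres, labels

# Two well-separated Gaussian blobs; the heuristic recovers them.
rng = np.random.default_rng(1)
pts = np.vstack([rng.normal(0.0, 0.1, (50, 2)),
                 rng.normal(5.0, 0.1, (50, 2))])
centres, labels = k_means(pts, k=2)
```

The alternating structure of this heuristic is exactly what makes it a feasible-point method rather than a global one: each step only improves the objective locally, which motivates the globally optimal algorithms discussed next.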
In the context of clustering problems, however, using only a heuristic solution
comes with severe disadvantages regarding the interpretation of the clustering.
This motivates us to study globally optimal algorithms for the MSSC problem.
We note that although some algorithms have already been proposed for this
problem, it is still far from being “practically solved”. Therefore, we propose
mixed-integer programming techniques, which are mainly based on geometric
ideas and which can be incorporated in a
branch-and-cut based algorithm tailored
to the MSSC problem. Our numerical experiments show that these techniques
significantly improve the solution process of a
state-of-the-art MINLP solver
when applied to the problem.
We then turn to the study of cardinality-constrained optimization problems.
We consider two prominent problems of this class: sparse portfolio optimization and sparse regression. In many modern applications, it is common
to consider problems with thousands of variables. Therefore, globally optimal
algorithms are not always computationally viable and the study of sophisticated
heuristics is very desirable. Since these problems have a discrete-continuous
structure, decomposition methods are particularly well suited. We apply a
penalty alternating direction method that exploits this structure and provides
very good feasible points in a reasonable amount of time. Our computational
study shows that our methods are competitive with
state-of-the-art solvers and heuristics.
Surveys play a major role in studying social and behavioral phenomena that are difficult to
observe. Survey data provide insights into the determinants and consequences of human
behavior and social interactions. Many domains rely on high quality survey data for decision
making and policy implementation including politics, health, business, and the social
sciences. Given a certain research question in a specific context, finding the most appropriate
survey design to ensure data quality and keep fieldwork costs low at the same time is a
difficult task. The aim of examining survey research methodology is to provide the best
evidence to estimate the costs and errors of different survey design options. The goal of this
thesis is to support and optimize the accumulation and sustainable use of evidence in survey
methodology in four steps:
(1) Identifying the gaps in meta-analytic evidence in survey methodology by a systematic
review of the existing evidence along the dimensions of a central framework in the
field
(2) Filling in these gaps with two meta-analyses in the field of survey methodology, one
on response rates in psychological online surveys, the other on panel conditioning
effects for sensitive items
(3) Assessing the robustness and sufficiency of the results of the two meta-analyses
(4) Proposing a publication format for the accumulation and dissemination of meta-analytic
evidence
Forest inventories provide significant monitoring information on forest health, biodiversity, and
resilience against disturbance, as well as on biomass and timber harvesting potential. For this
purpose, modern inventories increasingly exploit the advantages of airborne laser scanning (ALS)
and terrestrial laser scanning (TLS).
Although tree crown detection and delineation using ALS can be seen as a mature discipline, the
identification of individual stems is a rarely addressed task. In particular, the informative value of
the stem attributes—especially the inclination characteristics—is hardly known. In addition, a lack
of tools for the processing and fusion of forest-related data sources can be identified. This
thesis addresses these research gaps in four peer-reviewed papers, with a focus on the
suitability of ALS data for the detection and analysis of tree stems.
In addition to providing a novel post-processing strategy for geo-referencing forest inventory plots,
the thesis shows that ALS-based stem detections are very reliable and their positions are
accurate. In particular, the detected stems proved suitable for studying prevailing trunk inclination
angles and orientations, revealing a species-specific down-slope inclination of the tree stems and a
leeward orientation of conifers.
Broadcast media such as television have spread rapidly worldwide in the last century. They provide viewers with access to new information and also represent a source of entertainment that unconsciously exposes them to different social norms and moral values. Although the potential impact of exposure to television content has been studied intensively in economic research in recent years, studies examining the long-term causal effects of media exposure are still rare. Chapters 2 to 4 of this thesis therefore contribute to a better understanding of the long-term effects of television exposure.
Chapter 2 empirically investigates whether access to reliable environmental information through television can influence individuals' environmental awareness and pro-environmental behavior. Analyzing exogenous variation in Western television reception in the German Democratic Republic shows that access to objective reporting on environmental pollution can enhance concerns regarding pollution and affect the likelihood of being active in environmental interest groups.
Chapter 3 utilizes the same natural experiment and explores the relationship between exposure to foreign mass media content and xenophobia. In contrast to the state television broadcaster in the German Democratic Republic, West German television regularly confronted its viewers with foreign (non-German) broadcasts. By applying multiple measures for xenophobic attitudes, our findings indicate a persistent mitigating impact of foreign media content on xenophobia.
Chapter 4 deals with another unique feature of West German television. In contrast to East German media, Western television programs regularly exposed their audience to unmarried and childless characters. The results suggest that exposure to different gender stereotypes contained in television programs can affect marriage, divorce, and birth rates. However, our findings indicate that mainly women were affected by the exposure to unmarried and childless characters.
Chapter 5 examines the influence of social media marketing on crowd participation in equity crowdfunding. By analyzing 26,883 investment decisions on three German equity crowdfunding platforms, our results show that startups can influence the success of their equity crowdfunding campaign through social media posts on Facebook and Twitter.
In Chapter 6, we incorporate the concept of habit formation into the theoretical literature on trade unions and contribute to a better understanding of how internal habit preferences influence trade union behavior. The results reveal that such internal reference points lead trade unions to raise wages over time, which in turn reduces employment. A numerical example illustrates that the wage effects and the decline in employment can be substantial.
Due to the transition towards climate neutrality, energy markets are rapidly evolving. New technologies are developed that allow electricity from renewable energy sources to be stored or to be converted into other energy commodities. As a consequence, new players enter the markets and existing players gain more importance. Market equilibrium problems are capable of capturing these changes and therefore enable us to answer contemporary research questions with regard to energy market design and climate policy.
This cumulative dissertation is devoted to the study of different market equilibrium problems that address such emerging aspects in liberalized energy markets. In the first part, we review a well-studied competitive equilibrium model for energy commodity markets and extend this model by sector coupling, by temporal coupling, and by a more detailed representation of physical laws and technical requirements. Moreover, we summarize our main contributions of the last years with respect to analyzing the market equilibria of the resulting equilibrium problems.
For the extension regarding sector coupling, we derive sufficient conditions for ensuring uniqueness of the short-run equilibrium a priori and for verifying uniqueness of the long-run equilibrium a posteriori. Furthermore, we present illustrative examples showing that each of the derived conditions is indeed necessary to guarantee uniqueness in general.
For the extension regarding temporal coupling, we provide sufficient conditions for ensuring uniqueness of demand and production a priori. These conditions also imply uniqueness of the short-run equilibrium in the case of a single storage operator. However, in the case of multiple storage operators, examples illustrate that charging and discharging decisions are not unique in general. We conclude the equilibrium analysis with an a posteriori criterion for verifying uniqueness of a given short-run equilibrium. Since the computation of equilibria is much more challenging due to the temporal coupling, we briefly review why a tailored parallel and distributed alternating direction method of multipliers enables the efficient computation of market equilibria.
For the extension regarding physical laws and technical requirements, we show that, in nonconvex settings, existence of an equilibrium is not guaranteed and that the fundamental welfare theorems therefore fail to hold. In addition, we argue that the welfare theorems can be re-established in a market design in which the system operator is committed to a welfare objective. For the case of a profit-maximizing system operator, we propose an algorithm that indicates existence of an equilibrium and that computes an equilibrium in the case of existence. Based on well-known instances from the literature on the gas and electricity sector, we demonstrate the broad applicability of our algorithm. Our computational results suggest that an equilibrium often exists for an application involving nonconvex but continuous stationary gas physics. In turn, integralities introduced by the switchability of DC lines in DC electricity networks lead to many instances without an equilibrium. Finally, we state sufficient conditions under which the gas application has a unique equilibrium and the line switching application has finitely many.
In the second part, all preprints belonging to this cumulative dissertation are provided. These preprints, as well as two journal articles to which the author of this thesis contributed, are referenced within the extended summary in the first part and contain more details.
The main focus of this work is to study the computational complexity of generalizations of the synchronization problem for deterministic finite automata (DFA). This problem asks, for a given DFA, whether there exists a word w that maps every state of the automaton to the same state. We call such a word w a synchronizing word. A synchronizing word brings a system from an unknown configuration into a well-defined configuration and thereby resets the system.
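A synchronizing word for a small DFA can be found by breadth-first search over subsets of states; the sketch below (a toy automaton in the style of the well-known Černý series, not code from the thesis) returns a shortest one.

```python
from collections import deque

def shortest_synchronizing_word(states, alphabet, delta):
    """Breadth-first search on subsets of states: starting from the full
    state set, apply letters until a singleton subset is reached. This is
    exponential in the number of states in general, but fine for toy DFAs."""
    start = frozenset(states)
    queue, seen = deque([(start, "")]), {start}
    while queue:
        subset, word = queue.popleft()
        if len(subset) == 1:
            return word  # first singleton found by BFS = shortest word
        for a in alphabet:
            nxt = frozenset(delta[(q, a)] for q in subset)
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, word + a))
    return None  # the DFA has no synchronizing word

# Toy 3-state DFA: 'a' rotates the states cyclically,
# 'b' maps state 0 to 1 and fixes the others.
delta = {(0, 'a'): 1, (1, 'a'): 2, (2, 'a'): 0,
         (0, 'b'): 1, (1, 'b'): 1, (2, 'b'): 2}
w = shortest_synchronizing_word([0, 1, 2], "ab", delta)
print(w)  # 'baab' -- every start state ends in state 1
```

The subset construction underlying this search is also the reason shortest-synchronizing-word questions are computationally hard, which is the backdrop for the generalizations studied in this thesis.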
We generalize this problem in four different ways.
First, we restrict the set of potential synchronizing words to a fixed regular language; this yields the synchronization under regular constraint problem.
The motivation here is to control the structure of a synchronizing word so that, for instance, it first brings the system from an operate mode to a reset mode and then finally again into the operate mode.
The next generalization concerns the order of states in which a synchronizing word transitions the automaton. Here, a DFA A and a partial order R are given as input, and the question is whether there exists a word that synchronizes A and whose induced state order is consistent with R. We thereby study different ways in which a word induces an order on the state set.
Then, we shift our focus from DFAs to push-down automata and generalize the synchronization problem to push-down automata and, in subsequent work, to visibly push-down automata. Here, a synchronizing word still needs to map each state of the automaton to one state, but it must additionally fulfill constraints on the stack. We study three different types of stack constraints, where after reading the synchronizing word, the stacks associated with each run in the automaton must be (1) empty, (2) identical, or (3) arbitrary.
We observe that the synchronization problem for general push-down automata is undecidable and study restricted sub-classes of push-down automata where the problem becomes decidable. For visibly push-down automata we even obtain efficient algorithms for some settings.
The second part of this work studies the intersection non-emptiness problem for DFAs. This problem is related to the question of whether a given DFA A can be synchronized into a state q, since the set of words synchronizing A into q is the intersection of the languages accepted by copies of A with different initial states and with q as their final state.
For the intersection non-emptiness problem, we first study the complexity of the problem, which is PSPACE-complete in general, restricted to subclasses of DFAs associated with the two well-known Straubing-Thérien and Cohen-Brzozowski dot-depth hierarchies.
Finally, we study the problem whether a given minimal DFA A can be represented as the intersection of a finite set of smaller DFAs such that the language L(A) accepted by A is equal to the intersection of the languages accepted by the smaller DFAs. There, we focus on the subclass of permutation and commutative permutation DFAs and improve known complexity bounds.
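The product-automaton reduction behind intersection non-emptiness can be sketched as follows (toy parity DFAs with a hypothetical triple encoding; the product state space grows exponentially with the number of DFAs, which reflects the PSPACE-hardness):

```python
from collections import deque

def intersection_nonempty(dfas):
    """Each DFA is a triple (initial state, set of final states, transition
    dict). BFS over the product automaton: a reachable tuple of states that
    is final in every component witnesses a word accepted by all DFAs."""
    alphabet = {a for _, _, delta in dfas for (_, a) in delta}
    start = tuple(init for init, _, _ in dfas)
    queue, seen = deque([start]), {start}
    while queue:
        states = queue.popleft()
        if all(q in fin for q, (_, fin, _) in zip(states, dfas)):
            return True  # a common accepted word exists
        for a in alphabet:
            nxt = tuple(d[(q, a)] for q, (_, _, d) in zip(states, dfas))
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False

# Parity DFAs over {a}: one accepts even word lengths, one odd lengths.
even = (0, {0}, {(0, 'a'): 1, (1, 'a'): 0})
odd  = (0, {1}, {(0, 'a'): 1, (1, 'a'): 0})
print(intersection_nonempty([even, odd]))   # False
print(intersection_nonempty([even, even]))  # True
```

The same construction, applied to copies of a DFA A with varying initial states and a common final state q, is the link between intersection non-emptiness and synchronization into q described above.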
For decades, academics and practitioners have sought to understand whether and how (economic) events affect firm value. Optimally, these events occur exogenously, i.e. suddenly and unexpectedly, so that an accurate evaluation of the effects on firm value can be conducted. However, recent studies show that even the evaluation of exogenous events is often prone to challenges that can lead to diverse interpretations, resulting in heated debates. Recently, there have been intense debates in particular on the impact of takeover defenses and of Covid-19 on firm value. The announcements of takeover defenses and the propagation of Covid-19 are exogenous events that occur worldwide and are economically important, but they have been insufficiently examined. By answering open research questions, this dissertation aims to provide a greater understanding of the heterogeneous effects that exogenous events such as the announcements of takeover defenses and the propagation of Covid-19 have on firm value. In addition, this dissertation analyzes the influence of certain firm characteristics on the effects of these two exogenous events and identifies influencing factors that explain contradictory results in the existing literature and can thus reconcile different views.
Hybrid modelling, in general, describes the combination of at least two different methods to solve one specific task. In this work, hybrid models describe an approach that combines sophisticated, well-studied mathematical methods with deep neural networks to solve parameter estimation tasks. To combine these two methods, the structure of artificially generated acceleration data of an approximate vehicle model, the Quarter-Car-Model, is exploited. The acceleration of individual components within a coupled dynamical system can be described by a second-order ordinary differential equation involving the velocity and displacement of coupled states, scaled by the spring and damping coefficients of the system. An appropriate numerical integration scheme can then be used to simulate discrete acceleration profiles of the Quarter-Car-Model with random variation of the system parameters. Given explicit knowledge about the data structure, one can then investigate under which conditions it is possible to estimate the parameters of the dynamical system for a set of randomly generated data samples. We test whether neural networks are capable of solving parameter estimation problems in general, or whether they can be used to solve several sub-tasks that support a state-of-the-art parameter estimation method. Hybrid models are presented for parameter estimation under uncertainties, including, for instance, measurement noise or incompleteness of measurements, which combine knowledge about the data structure and several neural networks for robust parameter estimation within a dynamical system.
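A minimal sketch of this kind of data generation (a single-mass simplification of the Quarter-Car-Model; all parameter ranges and values are hypothetical, chosen only for illustration):

```python
import numpy as np

def simulate_acceleration(k, c, m=1.0, dt=1e-3, steps=2000):
    """Semi-implicit Euler simulation of a single-mass oscillator
    m*x'' + c*x' + k*x = 0, started from unit displacement. Returns the
    discrete acceleration profile that would serve as network input."""
    x, v = 1.0, 0.0  # initial displacement and velocity
    acc = np.empty(steps)
    for t in range(steps):
        a = -(c * v + k * x) / m  # acceleration from spring and damper forces
        acc[t] = a
        v += dt * a               # update velocity first ...
        x += dt * v               # ... then position (symplectic Euler)
    return acc

# Random variation of the (hypothetical) parameter ranges yields a
# labelled dataset: acceleration profile -> (k, c).
rng = np.random.default_rng(0)
k, c = rng.uniform(5.0, 50.0), rng.uniform(0.1, 2.0)
profile = simulate_acceleration(k, c)
print(profile.shape)  # (2000,)
```

Pairs of simulated profiles and the parameters that generated them are exactly the kind of supervised training data on which the parameter-estimation networks described above can be trained and evaluated.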
Issues in Price Measurement
(2022)
This thesis focuses on issues in price measurement and consists of three chapters. Due to outdated weighting information, a Laspeyres-based consumer price index (CPI) is prone to accumulating upward bias. Chapter 1 therefore introduces and examines simple and transparent revision approaches that retrospectively address the source of this bias. They provide a consistent long-run time series of the CPI and require no additional information. Furthermore, a coherent decomposition of the bias into the contributions of individual product groups is developed. In a case study, the approaches are applied to a Laspeyres-based CPI. The empirical results confirm the theoretical predictions. The proposed revision approaches are applicable not only to most national CPIs but also to other price-level measures such as the producer price index or the import and export price indices.
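The substitution bias at issue can be made concrete with a toy two-good example (illustrative numbers only, not from the chapter): a fixed-base Laspeyres index ignores that consumers substitute toward the relatively cheaper good, and thus exceeds the current-weighted Paasche index.

```python
# Two goods; consumers substitute away from the good whose price rises.
p0 = {"apples": 1.0, "pears": 1.0}   # base-period prices
q0 = {"apples": 10,  "pears": 10}    # base-period quantities (Laspeyres weights)
p1 = {"apples": 2.0, "pears": 1.0}   # apples double in price ...
q1 = {"apples": 4,   "pears": 16}    # ... so demand shifts towards pears

goods = p0.keys()
laspeyres = sum(p1[g] * q0[g] for g in goods) / sum(p0[g] * q0[g] for g in goods)
paasche   = sum(p1[g] * q1[g] for g in goods) / sum(p0[g] * q1[g] for g in goods)

print(laspeyres)  # 1.5 -- fixed base weights ignore the substitution
print(paasche)    # 1.2 -- current-period weights reflect it
```

The gap between the two indices grows as the base weights age, which is the accumulating upward bias that the revision approaches of Chapter 1 retrospectively correct.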
Chapter 2 is dedicated to the measurement of import and export price indices, which is complicated by the impact of exchange rates. These indices are usually also compiled as Laspeyres-type indices, so substitution bias is an issue, and the terms of trade (the ratio of the export and import price indices) are likely to be distorted as well. The underlying substitution bias accumulates over time. This chapter applies a simple and transparent retroactive correction approach that addresses the source of the substitution bias and produces meaningful long-run time series of import and export price levels and, therefore, of the terms of trade. Furthermore, an empirical case study demonstrates the efficacy and versatility of the correction approach.
Chapter 3 leaves the field of index revision and studies another issue in price measurement, namely the economic evaluation, in monetary terms, of digital products that have zero market prices. The chapter explores different methods for the economic valuation and pricing of free digital products and proposes an alternative way to calculate the economic value and a shadow price of free digital products: the Usage Cost Model (UCM). The goal of the chapter is, first of all, to formulate a theoretical framework and to incorporate an alternative measure of the value of free digital products. An empirical application is also presented to demonstrate how the theoretical model works. Some conclusions on applicability are drawn at the end of the chapter.