The dissertation deals with methods to improve design-based and model-assisted estimation techniques for surveys in a finite population framework. The focus is on the development of the statistical methodology as well as their implementation by means of tailor-made numerical optimization strategies. In that regard, the developed methods aim at computing statistics for several potentially conflicting variables of interest at aggregated and disaggregated levels of the population on the basis of one single survey. The work can be divided into two main research questions, which are briefly explained in the following sections.
First, an optimal multivariate allocation method is developed taking into account several stratification levels. This approach results in a multi-objective optimization problem due to the simultaneous consideration of several variables of interest. In preparation for the numerical solution, several scalarization and standardization techniques are presented, which represent the different preferences of potential users. In addition, it is shown that by solving the problem scalarized with a weighted sum for all combinations of weights, the entire Pareto frontier of the original problem can be generated. By exploiting the special structure of the problem, the scalarized problems can be efficiently solved by a semismooth Newton method. In order to apply this numerical method to other scalarization techniques as well, an alternative approach is suggested, which traces the problem back to the weighted sum case. To address regional estimation quality requirements at multiple stratification levels, the potential use of upper bounds for regional variances is integrated into the method. In addition to restrictions on regional estimates, the method enables the consideration of box-constraints for the stratum-specific sample sizes, allowing minimum and maximum stratum-specific sampling fractions to be defined.
In addition to the allocation method, a generalized calibration method is developed, which is supposed to achieve coherent and efficient estimates at different stratification levels. The developed calibration method takes into account a very large number of benchmarks at different stratification levels, which may be obtained from different sources such as registers, paradata or other surveys using different estimation techniques. In order to incorporate the heterogeneous quality and the multitude of benchmarks, a relaxation of selected benchmarks is proposed. In that regard, predefined tolerances are assigned to problematic benchmarks at low aggregation levels in order to avoid an exact fulfillment. In addition, the generalized calibration method allows the use of box-constraints for the correction weights in order to avoid an extremely high variation of the weights. Furthermore, a variance estimation by means of a rescaling bootstrap is presented.
Both developed methods are analyzed and compared with existing methods in extensive simulation studies on the basis of a realistic synthetic data set of all households in Germany. Due to the similar requirements and objectives, both methods can be successively applied to a single survey in order to combine their efficiency advantages. In addition, both methods can be solved in a time-efficient manner using very comparable optimization approaches. These are based on transformations of the optimality conditions. The dimension of the resulting system of equations is ultimately independent of the dimension of the original problem, which enables the application even for very large problem instances.
Optimal Control of Partial Integro-Differential Equations and Analysis of the Gaussian Kernel
(2018)
An important field of applied mathematics is the simulation of complex financial, mechanical, chemical, physical or medical processes with mathematical models. In addition to the pure modeling of the processes, the simultaneous optimization of an objective function by changing the model parameters is often the actual goal. Models in fields such as finance, biology or medicine benefit from this optimization step.
While many processes can be modeled using an ordinary differential equation (ODE), partial differential equations (PDEs) are needed to optimize heat conduction and flow characteristics, spreading of tumor cells in tissue as well as option prices. A partial integro-differential equation (PIDE) is a parital differential equation involving an integral operator, e.g., the convolution of the unknown function with a given kernel function. PIDEs occur for example in models that simulate adhesive forces between cells or option prices with jumps.
In each of the two parts of this thesis, a certain PIDE is the main object of interest. In the first part, we study a semilinear PIDE-constrained optimal control problem with the aim to derive necessary optimality conditions. In the second, we analyze a linear PIDE that includes the convolution of the unknown function with the Gaussian kernel.
Sample surveys are a widely used and cost effective tool to gain information about a population under consideration. Nowadays, there is an increasing demand not only for information on the population level but also on the level of subpopulations. For some of these subpopulations of interest, however, very small subsample sizes might occur such that the application of traditional estimation methods is not expedient. In order to provide reliable information also for those so called small areas, small area estimation (SAE) methods combine auxiliary information and the sample data via a statistical model.
The present thesis deals, among other aspects, with the development of highly flexible and close to reality small area models. For this purpose, the penalized spline method is adequately modified which allows to determine the model parameters via the solution of an unconstrained optimization problem. Due to this optimization framework, the incorporation of shape constraints into the modeling process is achieved in terms of additional linear inequality constraints on the optimization problem. This results in small area estimators that allow for both the utilization of the penalized spline method as a highly flexible modeling technique and the incorporation of arbitrary shape constraints on the underlying P-spline function.
In order to incorporate multiple covariates, a tensor product approach is employed to extend the penalized spline method to multiple input variables. This leads to high-dimensional optimization problems for which naive solution algorithms yield an unjustifiable complexity in terms of runtime and in terms of memory requirements. By exploiting the underlying tensor nature, the present thesis provides adequate computationally efficient solution algorithms for the considered optimization problems and the related memory efficient, i.e. matrix-free, implementations. The crucial point thereby is the (repetitive) application of a matrix-free conjugated gradient method, whose runtime is drastically reduced by a matrx-free multigrid preconditioner.
The economic growth theory analyses which factors affect economic growth and tries to analyze how it can last. A popular neoclassical growth model is the Ramsey-Cass-Koopmans model, which aims to determine how much of its income a nation or an economy should save in order to maximize its welfare. In this thesis, we present and analyze an extended capital accumulation equation of a spatial version of the Ramsey model, balancing diffusive and agglomerative effects. We model the capital mobility in space via a nonlocal diffusion operator which allows for jumps of the capital stock from one location to an other. Moreover, this operator smooths out heterogeneities in the factor distributions slower, which generated a more realistic behavior of capital flows. In addition to that, we introduce an endogenous productivity-production operator which depends on time and on the capital distribution in space. This operator models the technological progress of the economy. The resulting mathematical model is an optimal control problem under a semilinear parabolic integro-differential equation with initial and volume constraints, which are a nonlocal analog to local boundary conditions, and box-constraints on the state and the control variables. In this thesis, we consider this problem on a bounded and unbounded spatial domain, in both cases with a finite time horizon. We derive existence results of weak solutions for the capital accumulation equations in both settings and we proof the existence of a Ramsey equilibrium in the unbounded case. Moreover, we solve the optimal control problem numerically and discuss the results in the economic context.