Refine
Keywords
- Optimal Multivariate Allocation (1) (remove)
The dissertation deals with methods to improve design-based and model-assisted estimation techniques for surveys in a finite population framework. The focus is on the development of the statistical methodology as well as their implementation by means of tailor-made numerical optimization strategies. In that regard, the developed methods aim at computing statistics for several potentially conflicting variables of interest at aggregated and disaggregated levels of the population on the basis of one single survey. The work can be divided into two main research questions, which are briefly explained in the following sections.
First, an optimal multivariate allocation method is developed taking into account several stratification levels. This approach results in a multi-objective optimization problem due to the simultaneous consideration of several variables of interest. In preparation for the numerical solution, several scalarization and standardization techniques are presented, which represent the different preferences of potential users. In addition, it is shown that by solving the problem scalarized with a weighted sum for all combinations of weights, the entire Pareto frontier of the original problem can be generated. By exploiting the special structure of the problem, the scalarized problems can be efficiently solved by a semismooth Newton method. In order to apply this numerical method to other scalarization techniques as well, an alternative approach is suggested, which traces the problem back to the weighted sum case. To address regional estimation quality requirements at multiple stratification levels, the potential use of upper bounds for regional variances is integrated into the method. In addition to restrictions on regional estimates, the method enables the consideration of box-constraints for the stratum-specific sample sizes, allowing minimum and maximum stratum-specific sampling fractions to be defined.
In addition to the allocation method, a generalized calibration method is developed, which is supposed to achieve coherent and efficient estimates at different stratification levels. The developed calibration method takes into account a very large number of benchmarks at different stratification levels, which may be obtained from different sources such as registers, paradata or other surveys using different estimation techniques. In order to incorporate the heterogeneous quality and the multitude of benchmarks, a relaxation of selected benchmarks is proposed. In that regard, predefined tolerances are assigned to problematic benchmarks at low aggregation levels in order to avoid an exact fulfillment. In addition, the generalized calibration method allows the use of box-constraints for the correction weights in order to avoid an extremely high variation of the weights. Furthermore, a variance estimation by means of a rescaling bootstrap is presented.
Both developed methods are analyzed and compared with existing methods in extensive simulation studies on the basis of a realistic synthetic data set of all households in Germany. Due to the similar requirements and objectives, both methods can be successively applied to a single survey in order to combine their efficiency advantages. In addition, both methods can be solved in a time-efficient manner using very comparable optimization approaches. These are based on transformations of the optimality conditions. The dimension of the resulting system of equations is ultimately independent of the dimension of the original problem, which enables the application even for very large problem instances.