Filtern
Erscheinungsjahr
- 2023 (4) (entfernen)
Dokumenttyp
Volltext vorhanden
- ja (4) (entfernen)
Schlagworte
- Optimierung (4) (entfernen)
Institut
- Fachbereich 4 (1)
- Fachbereich 6 (1)
Die Dissertation beschäftigt sich mit einer neuartigen Art von Branch-and-Bound Algorithmen, deren Unterschied zu klassischen Branch-and-Bound Algorithmen darin besteht, dass
das Branching durch die Addition von nicht-negativen Straftermen zur Zielfunktion erfolgt
anstatt durch das Hinzufügen weiterer Nebenbedingungen. Die Arbeit zeigt die theoretische Korrektheit des Algorithmusprinzips für verschiedene allgemeine Klassen von Problemen und evaluiert die Methode für verschiedene konkrete Problemklassen. Für diese Problemklassen, genauer Monotone und Nicht-Monotone Gemischtganzzahlige Lineare Komplementaritätsprobleme und Gemischtganzzahlige Lineare Probleme, präsentiert die Arbeit
verschiedene problemspezifische Verbesserungsmöglichkeiten und evaluiert diese numerisch.
Weiterhin vergleicht die Arbeit die neue Methode mit verschiedenen Benchmark-Methoden
mit größtenteils guten Ergebnissen und gibt einen Ausblick auf weitere Anwendungsgebiete
und zu beantwortende Forschungsfragen.
The publication of statistical databases is subject to legal regulations, e.g. national statistical offices are only allowed to publish data if the data cannot be attributed to individuals. Achieving this privacy standard requires anonymizing the data prior to publication. However, data anonymization inevitably leads to a loss of information, which should be kept minimal. In this thesis, we analyze the anonymization method SAFE used in the German census in 2011 and we propose a novel integer programming-based anonymization method for nominal data.
In the first part of this thesis, we prove that a fundamental variant of the underlying SAFE optimization problem is NP-hard. This justifies the use of heuristic approaches for large data sets. In the second part, we propose a new anonymization method belonging to microaggregation methods, specifically designed for nominal data. This microaggregation method replaces rows in a microdata set with representative values to achieve k-anonymity, ensuring each data row is identical to at least k − 1 other rows. In addition to the overall dissimilarities of the data rows, the method accounts for errors in resulting frequency tables, which are of high interest for nominal data in practice. The method employs a typical two-step structure: initially partitioning the data set into clusters and subsequently replacing all cluster elements with representative values to achieve k-anonymity. For the partitioning step, we propose a column generation scheme followed by a heuristic to obtain an integer solution, which is based on the dual information. For the aggregation step, we present a mixed-integer problem formulation to find cluster representatives. To this end, we take errors in a subset of frequency tables into account. Furthermore, we show a reformulation of the problem to a minimum edge-weighted maximal clique problem in a multipartite graph, which allows for a different perspective on the problem. Moreover, we formulate a mixed-integer program, which combines the partitioning and the aggregation step and aims to minimize the sum of chi-squared errors in frequency tables.
Finally, an experimental study comparing the methods covered or developed in this work shows particularly strong results for the proposed method with respect to relative criteria, while SAFE shows its strength with respect to the maximum absolute error in frequency tables. We conclude that the inclusion of integer programming in the context of data anonymization is a promising direction to reduce the inevitable information loss inherent in anonymization, particularly for nominal data.
Properties Evaluation of Composite Materials Based on Gypsum Plaster and Posidonia Oceanica Fibers
(2023)
Estimating the amount of material without significant losses at the end of hybrid casting is a problem addressed in this study. To minimize manufacturing costs and improve the accuracy of results, a correction factor (CF) was used in the formula to estimate the volume percent of the material in order to reduce material losses during the sample manufacturing stage, allowing for greater confidence between the approved blending plan and the results obtained. In this context, three material mixing schemes of different sizes and shapes (gypsum plaster, sand (0/2), gravel (2/4), and Posidonia oceanica fibers (PO)) were created to verify the efficiency of CF and more precisely study the physico-mechanical effects on the samples. The results show that the use of a CF can reduce mixing loss to almost 0%. The optimal compressive strength of the sample (S1B) with the lowest mixing loss was 7.50 MPa. Under optimal conditions, the addition of PO improves mix volume percent correction (negligible), flexural strength (5.45%), density (18%), and porosity (3.70%) compared with S1B. On the other hand, the addition of PO thermo-chemical treatment by NaOH increases the compressive strength (3.97%) compared with PO due to the removal of impurities on the fiber surface, as shown by scanning electron microscopy. We then determined the optimal mixture ratio (PO divided by a mixture of plaster, sand, and gravel), which equals 0.0321 because Tunisian gypsum contains small amounts of bassanite and calcite, as shown by the X-ray diffraction results.
Modern decision making in the digital age is highly driven by the massive amount of
data collected from different technologies and thus affects both individuals as well as
economic businesses. The benefit of using these data and turning them into knowledge
requires appropriate statistical models that describe the underlying observations well.
Imposing a certain parametric statistical model goes along with the need of finding
optimal parameters such that the model describes the data best. This often results in
challenging mathematical optimization problems with respect to the model’s parameters
which potentially involve covariance matrices. Positive definiteness of covariance matrices
is required for many advanced statistical models and these constraints must be imposed
for standard Euclidean nonlinear optimization methods which often results in a high
computational effort. As Riemannian optimization techniques proved efficient to handle
difficult matrix-valued geometric constraints, we consider optimization over the manifold
of positive definite matrices to estimate parameters of statistical models. The statistical
models treated in this thesis assume that the underlying data sets used for parameter
fitting have a clustering structure which results in complex optimization problems. This
motivates to use the intrinsic geometric structure of the parameter space. In this thesis,
we analyze the appropriateness of Riemannian optimization over the manifold of positive
definite matrices on two advanced statistical models. We establish important problem-
specific Riemannian characteristics of the two problems and demonstrate the importance
of exploiting the Riemannian geometry of covariance matrices based on numerical studies.