Es werden die mathematischen Methoden und algorithmischen Verfahren der Clusteranalyse im Hinblick auf Bedeutungsrepräsentationen untersucht. Im Rahmen der deskriptiven und explorativen Datenanalyse werden die Voraussetzungen und Bedingungen des clusteranalytischen Ansatzes und die Möglichkeiten seiner Anwendung diskutiert, die zur adäquaten Ermittlung und Beschreibung von Gruppierungen von Bedeutungspunkten im semantischen Raum verwendet werden, welche nach räumlicher Lage und topologischen Nachbarschaften den Ähnlichkeiten von Bedeutungen sprachlicher Zeichen in Texten entsprechen. Dabei ist die große Anzahl frei wählbarer Parameter und der Einfluß, den jede Wahl eines der bekannten clusteranalytischen Verfahren in Bezug auf die vorauszusetzenden Vorkenntnisse von der Struktur der zu untersuchenden Daten auf die Güte der erwartbaren Ergebnisse hat, eine bekannte Schwäche der Clusteranalyse. Diese generelle Problematik belastet die Abschätzbarkeit von Erfolg und Adäquatheit unüberwachter Klassifikationsverfahren weit über die quantitativ-linguistischen Untersuchungen in der Gebrauchssemantik hinaus. Deshalb wird ein neues Verfahren entwickelt, welches den analysierten Daten in geringerem Maße als bisher Strukturen aufprägt und in höherem Maße als bisher von den analysierten Daten und ihren Strukturen gesteuert wird.
This dissertation includes three research articles on the portfolio risks of private investors. In the first article, we analyze a large data set of private banking portfolios in Switzerland of a major bank with the unique feature that parts of the portfolios were managed by the bank, and parts were advisory portfolios. To correct the heterogeneity of individual investors, we apply a mixture model and a cluster analysis. Our results suggest that there is indeed a substantial group of advised individual investors that outperform the bank managed portfolios, at least after fees. However, a simple passive strategy that invests in the MSCI World and a risk-free asset significantly outperforms both the better advisory and the bank managed portfolios. The new regulation of the EU for financial products (UCITS IV) prescribes Value at Risk (VaR) as the benchmark for assessing the risk of structured products. The second article discusses the limitations of this approach and shows that, in theory, the expected return of structured products can be unbounded while the VaR requirement for the lowest risk class can still be satisfied. Real-life examples of large returns within the lowest risk class are then provided. The results demonstrate that the new regulation could lead to new seemingly safe products that hide large risks. Behavioral investors who choose products based only on their official risk classes and their expected returns will, therefore, invest into suboptimal products. To overcome these limitations, we suggest a new risk-return measure for financial products based on the martingale measure that could erase such loopholes. Under the mean-VaR framework, the third article discusses the impacts of the underlying's first four moments on the structured product. By expanding the expected return and the VaR of a structured product with its underlying moments, it is possible to investigate each moment's impact on them, simultaneously. Results are tested by Monte Carlo simulation and historical simulation. The findings show that for the majority of structured products, underlyings with large positive skewness are preferred. The preferences for variance and for kurtosis are ambiguous.
This thesis is concerned with two classes of optimization problems which stem
mainly from statistics: clustering problems and cardinality-constrained optimization problems. We are particularly interested in the development of computational techniques to exactly or heuristically solve instances of these two classes
of optimization problems.
The minimum sum-of-squares clustering (MSSC) problem is widely used
to find clusters within a set of data points. The problem is also known as
the $k$-means problem, since the most prominent heuristic to compute a feasible
point of this optimization problem is the $k$-means method. In many modern
applications, however, the clustering suffers from uncertain input data due to,
e.g., unstructured measurement errors. The reason for this is that the clustering
result then represents a clustering of the erroneous measurements instead of
retrieving the true underlying clustering structure. We address this issue by
applying robust optimization techniques: we derive the strictly and $\Gamma$-robust
counterparts of the MSSC problem, which are as challenging to solve as the
original model. Moreover, we develop alternating direction methods to quickly
compute feasible points of good quality. Our experiments reveal that the more
conservative strictly robust model consistently provides better clustering solutions
than the nominal and the less conservative $\Gamma$-robust models.
In the context of clustering problems, however, using only a heuristic solution
comes with severe disadvantages regarding the interpretation of the clustering.
This motivates us to study globally optimal algorithms for the MSSC problem.
We note that although some algorithms have already been proposed for this
problem, it is still far from being “practically solved”. Therefore, we propose
mixed-integer programming techniques, which are mainly based on geometric
ideas and which can be incorporated in a
branch-and-cut based algorithm tailored
to the MSSC problem. Our numerical experiments show that these techniques
significantly improve the solution process of a
state-of-the-art MINLP solver
when applied to the problem.
We then turn to the study of cardinality-constrained optimization problems.
We consider two famous problem instances of this class: sparse portfolio optimization and sparse regression problems. In many modern applications, it is common
to consider problems with thousands of variables. Therefore, globally optimal
algorithms are not always computationally viable and the study of sophisticated
heuristics is very desirable. Since these problems have a discrete-continuous
structure, decomposition methods are particularly well suited. We then apply a
penalty alternating direction method that explores this structure and provides
very good feasible points in a reasonable amount of time. Our computational
study shows that our methods are competitive to
state-of-the-art solvers and heuristics.