A matrix A is called completely positive if there exists an entrywise nonnegative matrix B such that A = BB^T. These matrices can be used to obtain convex reformulations of, for example, nonconvex quadratic or combinatorial problems. One of the main problems with completely positive matrices is checking whether a given matrix is completely positive. This is known to be NP-hard in general.
For a given completely positive matrix A, it is nontrivial to find a cp-factorization A = BB^T with nonnegative B, since this factorization would provide a certificate for the matrix to be completely positive. This factorization is important not only for membership in the completely positive cone; it can also be used to recover the solution of the underlying quadratic or combinatorial problem. In addition, it is not known a priori how many columns are necessary to generate a cp-factorization for the given matrix. The minimal possible number of columns is called the cp-rank of A, and it is still an open question how to derive the cp-rank for a given matrix. Some facts on completely positive matrices and the cp-rank will be given in Chapter 2. Moreover, in Chapter 6, we will see a factorization algorithm which, for a given completely positive matrix A and a suitable starting point, computes the nonnegative factorization A = BB^T. The algorithm therefore returns a certificate for the matrix to be completely positive. As introduced in Chapter 3, the fundamental idea of the factorization algorithm is to start from an initial square factorization which is not necessarily entrywise nonnegative, and extend this factorization to a matrix for which the number of columns is greater than or equal to the cp-rank of A. The goal is then to transform this generated factorization into a cp-factorization.
This problem can be formulated as a nonconvex feasibility problem, as shown in Section 4.1, and solved by a method which is based on alternating projections, as proven in Chapter 6. On the topic of alternating projections, a survey will be given in Chapter 5. Here we will see how to apply this technique to several types of sets like subspaces, convex sets, manifolds and semialgebraic sets. Furthermore, we will see some known facts on the convergence rate for alternating projections between these types of sets. Considering more than two sets yields the so-called cyclic projections approach. Here some known facts for subspaces and convex sets will be shown. Moreover, we will see a new convergence result on cyclic projections among a sequence of manifolds in Section 5.4. In the context of cp-factorizations, a local convergence result for the introduced algorithm will be given. This result is based on the known convergence for alternating projections between semialgebraic sets. To obtain cp-factorizations with this first method, it is necessary to solve a second-order cone problem in every projection step, which is very costly. Therefore, in Section 6.2, we will see an additional heuristic extension, which improves the numerical performance of the algorithm. Extensive numerical tests in Chapter 7 will show that the factorization method is very fast in most instances. In addition, we will see how to derive a certificate for the matrix to be an element of the interior of the completely positive cone. As a further application, this method can be extended to find a symmetric nonnegative matrix factorization, where we consider an additional low-rank constraint. Here again, the method to derive factorizations for completely positive matrices can be used, albeit with some further adjustments, introduced in Section 8.1.
Moreover, we will see that even for the general case of deriving a nonnegative matrix factorization for a given rectangular matrix A, the key aspects of the completely positive factorization approach can be used. To this end, it becomes necessary to extend the idea of finding a completely positive factorization such that it can be used for rectangular matrices. This yields an applicable algorithm for nonnegative matrix factorization in Section 8.2. Numerical results for this approach will suggest that the presented algorithms and techniques to obtain completely positive matrix factorizations can be extended to general nonnegative factorization problems.
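The idea described above (start from a square factorization, pad it with columns, then rotate it towards a nonnegative factor) can be illustrated with a minimal Python sketch. This is not the thesis's algorithm: the function names, the padding with zero columns, the random starting point and the pseudoinverse-based update are all assumptions made for illustration, and no second-order cone problem is solved. The only property that holds by construction is that every iterate satisfies A = BB^T, because B is always the initial factor times an orthogonal matrix; nonnegativity of the returned factor is not guaranteed.

```python
import numpy as np

def project_orthogonal(M):
    """Nearest orthogonal matrix in Frobenius norm (polar factor via SVD)."""
    U, _, Vt = np.linalg.svd(M)
    return U @ Vt

def cp_factorization_sketch(A, r, iters=5000, tol=1e-10, seed=0):
    """Heuristic search for B >= 0 with A = B @ B.T and r >= n columns.

    Hypothetical illustration: alternates between clipping negative entries
    and projecting back onto the orthogonal matrices.
    """
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    # Initial factorization from the eigendecomposition, padded with zeros.
    w, V = np.linalg.eigh(A)
    B0 = np.hstack([V * np.sqrt(np.clip(w, 0.0, None)), np.zeros((n, r - n))])
    B0_pinv = np.linalg.pinv(B0)
    Q = project_orthogonal(rng.standard_normal((r, r)))
    for _ in range(iters):
        B = B0 @ Q
        if B.min() >= -tol:          # entrywise nonnegative: done
            break
        D = np.maximum(B, 0.0)       # clip negative entries
        # Move Q towards a matrix P with B0 @ P close to D (assumed update).
        P = B0_pinv @ D + (np.eye(r) - B0_pinv @ B0) @ Q
        Q = project_orthogonal(P)    # back to the orthogonal matrices
    return B0 @ Q
```

Checking `B.min()` on the returned factor tells whether the run actually produced a certificate; if it is clearly negative, a different seed or a larger `r` may be tried.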
This dissertation studies a novel class of branch-and-bound algorithms that differ from classical branch-and-bound algorithms in that
branching is performed by adding nonnegative penalty terms to the objective function
instead of adding further constraints. The thesis proves the theoretical correctness of the algorithmic principle for several general classes of problems and evaluates the method on several concrete problem classes. For these problem classes, namely monotone and nonmonotone mixed-integer linear complementarity problems and mixed-integer linear problems, the thesis presents
various problem-specific improvements and evaluates them numerically.
Furthermore, the thesis compares the new method with several benchmark methods,
with largely good results, and gives an outlook on further areas of application
and open research questions.
We consider a linear regression model for which we assume that some of the observed variables are irrelevant for the prediction. Including the wrong variables in the statistical model can either lead to the problem of having too little information to properly estimate the statistic of interest, or having too much information and consequently describing fictitious connections. This thesis considers discrete optimization to conduct a variable selection. In light of this, the subset selection regression method is analyzed. The approach has gained a lot of interest in recent years due to its promising predictive performance. A major challenge associated with the subset selection regression is the computational difficulty. In this thesis, we propose several improvements for the efficiency of the method. Novel bounds on the coefficients of the subset selection regression are developed, which help to tighten the relaxation of the associated mixed-integer program, which relies on a Big-M formulation. Moreover, a novel mixed-integer linear formulation for the subset selection regression based on a bilevel optimization reformulation is proposed. Finally, it is shown that the perspective formulation of the subset selection regression is equivalent to a state-of-the-art binary formulation. We use this insight to develop novel bounds for the subset selection regression problem, which prove to be highly effective in combination with the proposed linear formulation.
In the second part of this thesis, we examine the statistical conception of the subset selection regression and conclude that it is misaligned with its intention. The subset selection regression uses the training error to decide which variables to select. The approach conducts the validation on the training data, which is often not a good estimate of the prediction error. Hence, it requires a predetermined cardinality bound. Instead, we propose to select variables with respect to the cross-validation value. The process is formulated as a mixed-integer program in which the sparsity itself becomes subject to optimization. Usually, cross-validation is used to select the best model out of a few options. With the proposed program, the best model out of all possible models is selected. Since the cross-validation is a much better estimate of the prediction error, the model can select the best sparsity itself.
The thesis is concluded with an extensive simulation study which provides evidence that discrete optimization can be used to produce highly valuable predictive models with the cross-validation subset selection regression almost always producing the best results.
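To make the idea of selecting variables by cross-validation value concrete, here is a small brute-force Python sketch. It is only an illustration under stated assumptions: it enumerates all non-empty subsets instead of solving the mixed-integer program proposed in the thesis, it uses contiguous folds and ordinary least squares, and the function names `cv_error` and `best_subset_by_cv` are hypothetical. Enumeration is only feasible for a handful of variables.

```python
import itertools
import numpy as np

def cv_error(X, y, subset, n_folds=5):
    """Mean squared k-fold cross-validation error of least squares on `subset`."""
    n = X.shape[0]
    folds = np.array_split(np.arange(n), n_folds)
    Xs = X[:, subset]
    err = 0.0
    for test_idx in folds:
        train_mask = np.ones(n, dtype=bool)
        train_mask[test_idx] = False
        # Fit on the training folds, evaluate on the held-out fold.
        beta, *_ = np.linalg.lstsq(Xs[train_mask], y[train_mask], rcond=None)
        resid = y[test_idx] - Xs[test_idx] @ beta
        err += float(resid @ resid)
    return err / n

def best_subset_by_cv(X, y, n_folds=5):
    """Enumerate all non-empty subsets and return the one with the lowest CV error."""
    p = X.shape[1]
    best, best_err = None, np.inf
    for k in range(1, p + 1):
        for subset in itertools.combinations(range(p), k):
            e = cv_error(X, y, list(subset), n_folds)
            if e < best_err:
                best, best_err = subset, e
    return best, best_err
```

Note that no cardinality bound is passed in: the subset size is chosen by the cross-validation error itself, which is the point made in the abstract.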
Competitive analysis is a well known method for analyzing online algorithms.
Two online optimization problems, scheduling and list accessing, are considered in the thesis of Yida Zhu with respect to this method.
For both problems, several existing online and offline algorithms are studied. Their performances are compared with the performances of corresponding offline optimal algorithms.
In particular, the list accessing algorithm BIT is carefully reviewed.
The classical proof of its worst-case performance is simplified by exploiting knowledge about the optimal offline algorithm.
With regard to average-case analysis, a new closed formula is developed to determine the performance of BIT on a specific class of instances.
All algorithms considered in this thesis are also implemented in Julia.
Their empirical performances are studied and compared with each other directly.
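For readers unfamiliar with BIT, the following Python sketch shows the usual textbook description of the algorithm: every list item carries one random bit, each access flips the bit of the requested item, and the item is moved to the front when its bit becomes 1. This is an assumption-laden illustration, not the thesis's Julia implementation; the cost model (accessing position i costs i, moves to the front are free) and the function name are chosen here for simplicity.

```python
import random

def bit_list_access(initial_list, requests, seed=None):
    """Serve `requests` with the randomized BIT rule and return (cost, final list)."""
    rng = random.Random(seed)
    lst = list(initial_list)
    # One uniformly random bit per item, drawn once at the start.
    bits = {x: rng.randint(0, 1) for x in lst}
    cost = 0
    for x in requests:
        pos = lst.index(x)
        cost += pos + 1              # accessing position i costs i (1-based)
        bits[x] ^= 1                 # flip the bit of the requested item
        if bits[x] == 1:
            lst.insert(0, lst.pop(pos))  # move to front on every second access
    return cost, lst
```

Because an item is moved to the front on exactly every second access, requesting the same item twice in a row always leaves it at the front, whatever its initial bit was.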
This thesis is concerned with two classes of optimization problems which stem
mainly from statistics: clustering problems and cardinality-constrained optimization problems. We are particularly interested in the development of computational techniques to exactly or heuristically solve instances of these two classes
of optimization problems.
The minimum sum-of-squares clustering (MSSC) problem is widely used
to find clusters within a set of data points. The problem is also known as
the $k$-means problem, since the most prominent heuristic to compute a feasible
point of this optimization problem is the $k$-means method. In many modern
applications, however, the clustering suffers from uncertain input data due to,
e.g., unstructured measurement errors. The reason for this is that the clustering
result then represents a clustering of the erroneous measurements instead of
retrieving the true underlying clustering structure. We address this issue by
applying robust optimization techniques: we derive the strictly and $\Gamma$-robust
counterparts of the MSSC problem, which are as challenging to solve as the
original model. Moreover, we develop alternating direction methods to quickly
compute feasible points of good quality. Our experiments reveal that the more
conservative strictly robust model consistently provides better clustering solutions
than the nominal and the less conservative $\Gamma$-robust models.
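As a reminder of the heuristic that gives the $k$-means problem its name, here is a minimal Python sketch of the classical Lloyd iteration for the nominal MSSC problem. It does not implement the robust counterparts or the alternating direction methods of the thesis; the function name and the choice to pass the initial centers explicitly are assumptions made to keep the example deterministic.

```python
import numpy as np

def kmeans_lloyd(X, centers, max_iter=100):
    """Lloyd's heuristic: alternate nearest-center assignment and center update."""
    centers = np.array(centers, dtype=float)
    for _ in range(max_iter):
        # Squared distances from every point to every center.
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        new_centers = centers.copy()
        for j in range(len(centers)):
            members = X[labels == j]
            if len(members) > 0:     # keep the old center if a cluster is empty
                new_centers[j] = members.mean(axis=0)
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    labels = d2.argmin(axis=1)
    objective = float(d2[np.arange(len(X)), labels].sum())  # sum of squares
    return labels, centers, objective
```

The returned point is feasible for the MSSC problem but in general only locally optimal, which is the motivation for the global methods discussed next.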
In the context of clustering problems, however, using only a heuristic solution
comes with severe disadvantages regarding the interpretation of the clustering.
This motivates us to study globally optimal algorithms for the MSSC problem.
We note that although some algorithms have already been proposed for this
problem, it is still far from being “practically solved”. Therefore, we propose
mixed-integer programming techniques, which are mainly based on geometric
ideas and which can be incorporated in a branch-and-cut based algorithm tailored
to the MSSC problem. Our numerical experiments show that these techniques
significantly improve the solution process of a state-of-the-art MINLP solver
when applied to the problem.
We then turn to the study of cardinality-constrained optimization problems.
We consider two famous problem instances of this class: sparse portfolio optimization and sparse regression problems. In many modern applications, it is common
to consider problems with thousands of variables. Therefore, globally optimal
algorithms are not always computationally viable and the study of sophisticated
heuristics is very desirable. Since these problems have a discrete-continuous
structure, decomposition methods are particularly well suited. We apply a
penalty alternating direction method that exploits this structure and provides
very good feasible points in a reasonable amount of time. Our computational
study shows that our methods are competitive with state-of-the-art solvers and heuristics.
Many combinatorial optimization problems on finite graphs can be formulated as conic convex programs, e.g. the stable set problem, the maximum clique problem or the maximum cut problem. Especially NP-hard problems can be written as copositive programs. In this case the complexity is moved entirely into the copositivity constraint.
Copositive programming is a relatively new topic in optimization. It deals with optimization over the so-called copositive cone, a superset of the positive semidefinite cone, where the quadratic form x^T Ax has to be nonnegative only for nonnegative vectors x. Its dual cone is the cone of completely positive matrices, which consists of all matrices that can be decomposed as a sum of symmetric rank-one products bb^T of nonnegative vectors b.
The related optimization problems are linear programs with matrix variables and cone constraints.
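The two cones described above can be written down explicitly in the finite-dimensional case. The notation below ($\mathcal{S}_n$ for the symmetric $n \times n$ matrices, $A_G$ for the adjacency matrix of a graph $G$, $e$ for the all-ones vector) is chosen here for illustration and is not taken from the thesis. The last line is a well-known copositive formulation of the stability number $\alpha(G)$ from the literature (due to de Klerk and Pasechnik), included as an example of how an NP-hard graph problem becomes a linear program over the copositive cone.

```latex
\begin{align*}
  \mathcal{COP}_n &= \{\, A \in \mathcal{S}_n : x^T A x \ge 0 \text{ for all } x \in \mathbb{R}^n_{+} \,\}, \\
  \mathcal{CP}_n  &= \Big\{\, \sum_{i=1}^{k} b_i b_i^T : k \in \mathbb{N},\ b_i \in \mathbb{R}^n_{+} \,\Big\}
                   = \{\, BB^T : B \ge 0 \text{ entrywise} \,\}, \\
  \alpha(G) &= \min \{\, \lambda \in \mathbb{R} : \lambda (I + A_G) - e e^T \in \mathcal{COP}_n \,\}.
\end{align*}
```

In the last problem the objective and the constraint map are linear in $\lambda$; all of the difficulty sits in the membership test for $\mathcal{COP}_n$.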
However, some optimization problems can be formulated as combinatorial problems on infinite graphs. For example, the kissing number problem can be formulated as a stable set problem on a circle.
In this thesis we will discuss how the theory of copositive optimization can be lifted to infinite dimensions. For some special cases we will give applications in combinatorial optimization.
This work investigates the industrial applicability of graphics and stream processors in the field of fluid simulations. For this purpose, an explicit Runge-Kutta discontinuous Galerkin method in arbitrarily high order is implemented completely for the hardware architecture of GPUs. The same functionality is simultaneously realized for CPUs and compared to GPUs. Explicit time stepping schemes as well as established implicit methods are under consideration for the CPU. This work aims at the simulation of inviscid, transonic flows over the ONERA M6 wing. The discontinuities which typically arise in hyperbolic equations are treated with an artificial viscosity approach. It is further investigated how this approach fits into the explicit time stepping and works together with the special architecture of the GPU. Since the treatment of artificial viscosity is close to the simulation of the Navier-Stokes equations, it is reviewed how GPU-accelerated methods could be applied for computing viscous flows. This work is based on a nodal discontinuous Galerkin approach for linear hyperbolic problems. Here, it is extended to non-linear problems, which makes the application of numerical quadrature obligatory. Moreover, the representation of complex geometries is realized using isoparametric mappings. Higher order methods are typically very sensitive with respect to boundaries which are not properly resolved. For this purpose, an approach is presented which fits straight-sided DG meshes to curved geometries which are described by NURBS surfaces. The mesh is modeled as an elastic body and deformed according to the solution of closest point problems in order to minimize the gap to the original spline surface. The sensitivity with respect to geometry representations is reviewed at the end of this work in the context of shape optimization.
Here, the aerodynamic drag of the ONERA M6 wing is minimized according to the shape gradient which is implicitly smoothed within the mesh deformation approach. In this context a comparison to the classical Laplace-Beltrami operator is made in a Stokes flow situation.
Shape optimization is of interest in many fields of application. In particular, shape optimization problems arise frequently in technological processes which are modelled by partial differential equations (PDEs). In a lot of practical circumstances, the shape under investigation is parametrized by a finite number of parameters, which, on the one hand, allows the application of standard optimization approaches, but, on the other hand, unnecessarily limits the space of reachable shapes. Shape calculus presents a way to circumvent this dilemma. However, so far shape optimization based on shape calculus is mainly performed using gradient descent methods. One reason for this is the lack of symmetry of second order shape derivatives or shape Hessians. A major difference between shape optimization and the standard PDE constrained optimization framework is the lack of a linear space structure on shape spaces. If one cannot use a linear space structure, then the next best structure is a Riemannian manifold structure, in which one works with Riemannian shape Hessians. They possess the often sought property of symmetry, characterize well-posedness of optimization problems and define sufficient optimality conditions. In general, shape Hessians are used to accelerate gradient-based shape optimization methods. This thesis deals with shape optimization problems constrained by PDEs and embeds these problems in the framework of optimization on Riemannian manifolds to provide efficient techniques for PDE constrained shape optimization problems on shape spaces. A Lagrange-Newton and a quasi-Newton technique in shape spaces for PDE constrained shape optimization problems are formulated. These techniques are based on the Hadamard-form of shape derivatives, i.e., on the form of integrals over the surface of the shape under investigation. It is often a very tedious, not to say painful, process to derive such surface expressions. 
Along the way, volume formulations in the form of integrals over the entire domain appear as an intermediate step. This thesis couples volume integral formulations of shape derivatives with optimization strategies on shape spaces in order to establish efficient shape algorithms reducing analytical effort and programming work. In this context, a novel shape space is proposed.
Due to the transition towards climate neutrality, energy markets are rapidly evolving. New technologies are developed that allow electricity from renewable energy sources to be stored or to be converted into other energy commodities. As a consequence, new players enter the markets and existing players gain more importance. Market equilibrium problems are capable of capturing these changes and therefore enable us to answer contemporary research questions with regard to energy market design and climate policy.
This cumulative dissertation is devoted to the study of different market equilibrium problems that address such emerging aspects in liberalized energy markets. In the first part, we review a well-studied competitive equilibrium model for energy commodity markets and extend this model by sector coupling, by temporal coupling, and by a more detailed representation of physical laws and technical requirements. Moreover, we summarize our main contributions of the last years with respect to analyzing the market equilibria of the resulting equilibrium problems.
For the extension regarding sector coupling, we derive sufficient conditions for ensuring uniqueness of the short-run equilibrium a priori and for verifying uniqueness of the long-run equilibrium a posteriori. Furthermore, we present examples illustrating that each of the derived conditions is indeed necessary to guarantee uniqueness in general.
For the extension regarding temporal coupling, we provide sufficient conditions for ensuring uniqueness of demand and production a priori. These conditions also imply uniqueness of the short-run equilibrium in case of a single storage operator. However, in case of multiple storage operators, examples illustrate that charging and discharging decisions are not unique in general. We conclude the equilibrium analysis with an a posteriori criterion for verifying uniqueness of a given short-run equilibrium. Since the computation of equilibria is much more challenging due to the temporal coupling, we briefly review why a tailored parallel and distributed alternating direction method of multipliers makes it possible to compute market equilibria efficiently.
For the extension regarding physical laws and technical requirements, we show that, in nonconvex settings, existence of an equilibrium is not guaranteed and that the fundamental welfare theorems therefore fail to hold. In addition, we argue that the welfare theorems can be re-established in a market design in which the system operator is committed to a welfare objective. For the case of a profit-maximizing system operator, we propose an algorithm that indicates existence of an equilibrium and that computes an equilibrium in the case of existence. Based on well-known instances from the literature on the gas and electricity sector, we demonstrate the broad applicability of our algorithm. Our computational results suggest that an equilibrium often exists for an application involving nonconvex but continuous stationary gas physics. In turn, integralities introduced due to the switchability of DC lines in DC electricity networks lead to many instances without an equilibrium. Finally, we state sufficient conditions under which the gas application has a unique equilibrium and the line switching application has finitely many.
In the second part, all preprints belonging to this cumulative dissertation are provided. These preprints, as well as two journal articles to which the author of this thesis contributed, are referenced within the extended summary in the first part and contain more details.
Optimal control problems are optimization problems governed by ordinary or partial differential equations (PDEs). A general formulation is given by $\min_{(y,u)} J(y,u)$ subject to $e(y,u)=0$, where $e_y^{-1}$ is assumed to exist. It consists of three main elements: 1. The cost functional $J$, which models the purpose of the control on the system. 2. The control function $u$, which represents the influence of the environment on the system. 3. The set of differential equations $e(y,u)=0$ modeling the controlled system, represented by the state function $y:=y(u)$, which depends on $u$. Problems of this kind are well investigated and arise in many fields of application, for example robot control, control of biological processes, test drive simulation, and shape and topology optimization. In this thesis, an academic model problem of the form $\min_{(y,u)} J(y,u) := \frac{1}{2}\|y-y_d\|^2_{L^2(\Omega)}+\frac{\alpha}{2}\|u\|^2_{L^2(\Omega)}$ subject to $-\operatorname{div}(A\nabla y)+cy=f+u$ in $\Omega$, $y=0$ on $\partial\Omega$ and $u\in U_{ad}$ is considered. The objective is of tracking type with a given target function $y_d$ and a regularization term with parameter $\alpha$. The control function $u$ acts on the whole domain $\Omega$. The underlying partial differential equation is assumed to be uniformly elliptic. This problem belongs to the class of linear-quadratic elliptic control problems with distributed control. The existence and uniqueness of an optimal solution for problems of this type is well known. In a first step, following the paradigm 'first optimize, then discretize', the necessary and sufficient optimality conditions are derived by means of the adjoint equation, which results in a characterization of the optimal solution in the form of an optimality system. In a second step, the occurring differential operators are approximated by finite differences and the resulting discretized optimality system is solved with a collective smoothing multigrid method (CSMG).
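The structure of the discretized optimality system can be seen in a small one-dimensional Python sketch. Several simplifications are assumed here and are not taken from the thesis: the domain is $(0,1)$ with $A=1$ and $c=0$, there are no control constraints ($U_{ad}=L^2(\Omega)$), the data $f$ and $y_d$ are arbitrary choices, and the coupled system is solved with a dense direct solver rather than with CSMG. With the adjoint state $p$, the three blocks are the state equation $Ky-u=f$, the adjoint equation $Kp=y-y_d$ and the gradient condition $\alpha u+p=0$.

```python
import numpy as np

def solve_optimality_system(n=31, alpha=1e-2):
    """Solve the discretized optimality system of a 1D tracking-type problem."""
    h = 1.0 / (n + 1)
    x = np.linspace(h, 1.0 - h, n)            # interior grid points
    # Finite-difference matrix of -d^2/dx^2 with homogeneous Dirichlet data.
    K = (2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h**2
    I, Z = np.eye(n), np.zeros((n, n))
    f = np.ones(n)                            # assumed source term
    y_d = np.sin(np.pi * x)                   # assumed target state
    # Unknowns ordered as (y, p, u).
    M = np.block([[K, Z, -I],                 # state:    K y - u     = f
                  [-I, K, Z],                 # adjoint: -y + K p     = -y_d
                  [Z, I, alpha * I]])         # gradient: p + alpha u = 0
    rhs = np.concatenate([f, -y_d, np.zeros(n)])
    sol = np.linalg.solve(M, rhs)
    y, p, u = sol[:n], sol[n:2 * n], sol[2 * n:]
    return x, y, p, u, K, f, y_d
```

The three unknowns are coupled pointwise, which is what a collective smoother exploits: it updates $(y,p,u)$ at a grid point together instead of relaxing each equation separately.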
In general, there are several optimization methods for solving the optimal control problem: an application of the implicit function theorem leads to so-called black-box approaches, where the PDE-constrained optimization problem is transformed into an unconstrained optimization problem and the reduced gradient of the reduced functional is computed via the adjoint approach. Other possibilities are quasi-Newton methods, which approximate the Hessian by a low-rank update based on gradient evaluations, Krylov-Newton methods or (reduced) SQP methods. The use of multigrid methods for optimization purposes is motivated by their optimal computational complexity, i.e. the number of required operations scales linearly with the number of unknowns, and by their rate of convergence, which is independent of the grid size. Originally, multigrid methods are a class of algorithms for solving linear systems arising from the discretization of partial differential equations. The main part of this thesis is devoted to the investigation of the implementability and the efficiency of the CSMG on commodity graphics cards. GPUs (graphics processing units) are designed for highly parallelizable graphics computations and possess many cores of SIMD architecture, which are able to outperform the CPU in terms of computational power and memory bandwidth. Here they are considered as a prototype for prospective multi-core computers with several hundred cores. When using GPUs as stream processors, two major problems arise: data have to be transferred from the CPU main memory to the GPU main memory, which can be quite slow, and the size of the GPU main memory is limited. Furthermore, a remarkable speed-up compared to a CPU is achieved only when the stream processors are used to full capacity. Therefore, new algorithms for the solution of optimal control problems are designed in this thesis.
To this end, a nonoverlapping domain decomposition method is introduced which allows the exploitation of the computational power of many GPUs or CPUs in parallel. This algorithm is based on preliminary work for elliptic problems and enhanced for the application to optimal control problems. For the domain decomposition into two subdomains, the linear system for the unknowns on the interface is solved with a Schur complement method by using a discrete approximation of the Steklov-Poincaré operator. For the academic optimal control problem, the arising capacitance matrix can be inverted analytically. On this basis, two different algorithms for the nonoverlapping domain decomposition for the case of many subdomains are proposed in this thesis: on the one hand, a recursive approach and, on the other hand, a simultaneous approach. Numerical tests compare the performance of the CSMG for the one-domain case and the two approaches for the multi-domain case on a GPU and CPU for different variants.
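The Schur complement idea for two nonoverlapping subdomains can be sketched in Python for a plain one-dimensional Poisson system. This is a simplified illustration under stated assumptions, not the method of the thesis: it treats a single elliptic equation rather than the optimality system, the interface is one grid point, the Schur complement is formed exactly instead of through an approximation of the Steklov-Poincaré operator, and the function names are hypothetical.

```python
import numpy as np

def poisson_matrix(n, h):
    """Finite-difference matrix of -d^2/dx^2 with homogeneous Dirichlet data."""
    return (2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h**2

def schur_two_subdomains(K, f, iface):
    """Solve K x = f by eliminating the two subdomains left and right of `iface`."""
    n = K.shape[0]
    I1 = np.arange(0, iface)                  # interior unknowns, subdomain 1
    I2 = np.arange(iface + 1, n)              # interior unknowns, subdomain 2
    G = np.array([iface])                     # interface unknown
    K11, K22 = K[np.ix_(I1, I1)], K[np.ix_(I2, I2)]
    K1G, K2G = K[np.ix_(I1, G)], K[np.ix_(I2, G)]
    KG1, KG2 = K[np.ix_(G, I1)], K[np.ix_(G, I2)]
    KGG = K[np.ix_(G, G)]
    # Interface system S x_G = g; the two subdomain solves are independent.
    S = KGG - KG1 @ np.linalg.solve(K11, K1G) - KG2 @ np.linalg.solve(K22, K2G)
    g = f[G] - KG1 @ np.linalg.solve(K11, f[I1]) - KG2 @ np.linalg.solve(K22, f[I2])
    xG = np.linalg.solve(S, g)
    # Back-substitution into each subdomain, again independently.
    x = np.empty(n)
    x[I1] = np.linalg.solve(K11, f[I1] - K1G @ xG)
    x[I2] = np.linalg.solve(K22, f[I2] - K2G @ xG)
    x[G] = xG
    return x
```

Since the solves with `K11` and `K22` do not depend on each other, each could be assigned to its own GPU or CPU; only the small interface system couples them.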