Filtern
Schlagworte
- Nichtlineare Optimierung (4) (entfernen)
Institut
- Mathematik (2)
- Fachbereich 4 (1)
In this thesis, we study the convergence behavior of an efficient optimization method used for the identification of parameters for underdetermined systems. The research is motivated by optimization problems arising from the estimation of parameters in neural networks as well as in option pricing models. In the first application, we are concerned with neural networks used to forecasting stock market indices. Since neural networks are able to describe extremely complex nonlinear structures they are used to improve the modelling of the nonlinear dependencies occurring in the financial markets. Applying neural networks to the forecasting of economic indicators, we are confronted with a nonlinear least squares problem of large dimension. Furthermore, in this application the number of parameters of the neural network to be determined is usually much larger than the number of patterns which are available for the determination of the unknowns. Hence, the residual function of our least squares problem is underdetermined. In option pricing, an important but usually not known parameter is the volatility of the underlying asset of the option. Assuming that the underlying asset follows a one-factor continuous diffusion model with nonconstant drift and volatility term, the value of an European call option satisfies a parabolic initial value problem with the volatility function appearing in one of the coefficients of the parabolic differential equation. Using this system equation, the estimation of the volatility function is described by a nonlinear least squares problem. Since the adaption of the volatility function is based only on a small number of observed market data these problems are naturally ill-posed. For the solution of these large-scale underdetermined nonlinear least squares problems we use a fully iterative inexact Gauss-Newton algorithm. We show how the structure of a neural network as well as that of the European call price model can be exploited using iterative methods. Moreover, we present theoretical statements for the convergence of the inexact Gauss-Newton algorithm applied to the less examined case of underdetermined nonlinear least squares problems. Finally, we present numerical results for the application of neural networks to the forecasting of stock market indices as well as for the construction of the volatility function in European option pricing models. In case of the latter application, we discretize the parabolic differential equation using a finite difference scheme and we elucidate convergence problems of the discrete scheme when the initial condition is not everywhere differentiable.
This thesis introduces a calibration problem for financial market models based on a Monte Carlo approximation of the option payoff and a discretization of the underlying stochastic differential equation. It is desirable to benefit from fast deterministic optimization methods to solve this problem. To be able to achieve this goal, possible non-differentiabilities are smoothed out with an appropriately chosen twice continuously differentiable polynomial. On the basis of this so derived calibration problem, this work is essentially concerned about two issues. First, the question occurs, if a computed solution of the approximating problem, derived by applying Monte Carlo, discretizing the SDE and preserving differentiability is an approximation of a solution of the true problem. Unfortunately, this does not hold in general but is linked to certain assumptions. It will turn out, that a uniform convergence of the approximated objective function and its gradient to the true objective and gradient can be shown under typical assumptions, for instance the Lipschitz continuity of the SDE coefficients. This uniform convergence then allows to show convergence of the solutions in the sense of a first order critical point. Furthermore, an order of this convergence in relation to the number of simulations, the step size for the SDE discretization and the parameter controlling the smooth approximation of non-differentiabilites will be shown. Additionally the uniqueness of a solution of the stochastic differential equation will be analyzed in detail. Secondly, the Monte Carlo method provides only a very slow convergence. The numerical results in this thesis will show, that the Monte Carlo based calibration indeed is feasible if one is concerned about the calculated solution, but the required calculation time is too long for practical applications. Thus, techniques to speed up the calibration are strongly desired. As already mentioned above, the gradient of the objective is a starting point to improve efficiency. Due to its simplicity, finite differences is a frequently chosen method to calculate the required derivatives. However, finite differences is well known to be very slow and furthermore, it will turn out, that there may also occur severe instabilities during optimization which may lead to the break down of the algorithm before convergence has been reached. In this manner a sensitivity equation is certainly an improvement but suffers unfortunately from the same computational effort as the finite difference method. Thus, an adjoint based gradient calculation will be the method of choice as it combines the exactness of the derivative with a reduced computational effort. Furthermore, several other techniques will be introduced throughout this thesis, that enhance the efficiency of the calibration algorithm. A multi-layer method will be very effective in the case, that the chosen initial value is not already close to the solution. Variance reduction techniques are helpful to increase accuracy of the Monte Carlo estimator and thus allow for fewer simulations. Storing instead of regenerating the random numbers required for the Brownian increments in the SDE will be efficient, as deterministic optimization methods anyway require to employ the identical random sequence in each function evaluation. Finally, Monte Carlo is very well suited for a parallelization, which will be done on several central processing units (CPUs).
In this thesis, we consider the solution of high-dimensional optimization problems with an underlying low-rank tensor structure. Due to the exponentially increasing computational complexity in the number of dimensions—the so-called curse of dimensionality—they present a considerable computational challenge and become infeasible even for moderate problem sizes.
Multilinear algebra and tensor numerical methods have a wide range of applications in the fields of data science and scientific computing. Due to the typically large problem sizes in practical settings, efficient methods, which exploit low-rank structures, are essential. In this thesis, we consider an application each in both of these fields.
Tensor completion, or imputation of unknown values in partially known multiway data is an important problem, which appears in statistics, mathematical imaging science and data science. Under the assumption of redundancy in the underlying data, this is a well-defined problem and methods of mathematical optimization can be applied to it.
Due to the fact that tensors of fixed rank form a Riemannian submanifold of the ambient high-dimensional tensor space, Riemannian optimization is a natural framework for these problems, which is both mathematically rigorous and computationally efficient.
We present a novel Riemannian trust-region scheme, which compares favourably with the state of the art on selected application cases and outperforms known methods on some test problems.
Optimization problems governed by partial differential equations form an area of scientific computing which has applications in a variety of areas, ranging from physics to financial mathematics. Due to the inherent high dimensionality of optimization problems arising from discretized differential equations, these problems present computational challenges, especially in the case of three or more dimensions. An even more challenging class of optimization problems has operators of integral instead of differential type in the constraint. These operators are nonlocal, and therefore lead to large, dense discrete systems of equations. We present a novel solution method, based on separation of spatial dimensions and provably low-rank approximation of the nonlocal operator. Our approach allows the solution of multidimensional problems with a complexity which is only slightly larger than linear in the univariate grid size; this improves the state of the art for a particular test problem problem by at least two orders of magnitude.
Due to the transition towards climate neutrality, energy markets are rapidly evolving. New technologies are developed that allow electricity from renewable energy sources to be stored or to be converted into other energy commodities. As a consequence, new players enter the markets and existing players gain more importance. Market equilibrium problems are capable of capturing these changes and therefore enable us to answer contemporary research questions with regard to energy market design and climate policy.
This cumulative dissertation is devoted to the study of different market equilibrium problems that address such emerging aspects in liberalized energy markets. In the first part, we review a well-studied competitive equilibrium model for energy commodity markets and extend this model by sector coupling, by temporal coupling, and by a more detailed representation of physical laws and technical requirements. Moreover, we summarize our main contributions of the last years with respect to analyzing the market equilibria of the resulting equilibrium problems.
For the extension regarding sector coupling, we derive sufficient conditions for ensuring uniqueness of the short-run equilibrium a priori and for verifying uniqueness of the long-run equilibrium a posteriori. Furthermore, we present illustrative examples that each of the derived conditions is indeed necessary to guarantee uniqueness in general.
For the extension regarding temporal coupling, we provide sufficient conditions for ensuring uniqueness of demand and production a priori. These conditions also imply uniqueness of the short-run equilibrium in case of a single storage operator. However, in case of multiple storage operators, examples illustrate that charging and discharging decisions are not unique in general. We conclude the equilibrium analysis with an a posteriori criterion for verifying uniqueness of a given short-run equilibrium. Since the computation of equilibria is much more challenging due to the temporal coupling, we shortly review why a tailored parallel and distributed alternating direction method of multipliers enables to efficiently compute market equilibria.
For the extension regarding physical laws and technical requirements, we show that, in nonconvex settings, existence of an equilibrium is not guaranteed and that the fundamental welfare theorems therefore fail to hold. In addition, we argue that the welfare theorems can be re-established in a market design in which the system operator is committed to a welfare objective. For the case of a profit-maximizing system operator, we propose an algorithm that indicates existence of an equilibrium and that computes an equilibrium in the case of existence. Based on well-known instances from the literature on the gas and electricity sector, we demonstrate the broad applicability of our algorithm. Our computational results suggest that an equilibrium often exists for an application involving nonconvex but continuous stationary gas physics. In turn, integralities introduced due to the switchability of DC lines in DC electricity networks lead to many instances without an equilibrium. Finally, we state sufficient conditions under which the gas application has a unique equilibrium and the line switching application has finitely many.
In the second part, all preprints belonging to this cumulative dissertation are provided. These preprints, as well as two journal articles to which the author of this thesis contributed, are referenced within the extended summary in the first part and contain more details.