There is no longer any doubt about the general effectiveness of psychotherapy. However, up to 40% of patients do not respond to treatment. Despite efforts to develop new treatments, overall effectiveness has not improved. Consequently, practice-oriented research has emerged to make research results more relevant to practitioners. Within this context, patient-focused research (PFR) focuses on the question of whether a particular treatment works for a specific patient. Finally, PFR gave rise to the precision mental health research movement that is trying to tailor treatments to individual patients by making data-driven and algorithm-based predictions. These predictions are intended to support therapists in their clinical decisions, such as the selection of treatment strategies and adaptation of treatment. The present work summarizes three studies that aim to generate different prediction models for treatment personalization that can be applied to practice. The goal of Study I was to develop a model for dropout prediction using data assessed prior to the first session (N = 2543). The usefulness of various machine learning (ML) algorithms and ensembles was assessed. The best model was an ensemble utilizing random forest and nearest neighbor modeling. It significantly outperformed generalized linear modeling, correctly identifying 63.4% of all cases and uncovering seven key predictors. The findings illustrated the potential of ML to enhance dropout predictions, but also highlighted that not all ML algorithms are equally suitable for this purpose. Study II utilized Study I’s findings to enhance the prediction of dropout rates. Data from the initial two sessions and observer ratings of therapist interventions and skills were employed to develop a model using an elastic net (EN) algorithm. 
The findings demonstrated that the model was significantly more effective at predicting dropout when using observer ratings, with a Cohen's d of up to .65, and more effective than the model in Study I, despite the smaller sample (N = 259). These results indicated that model generation could be improved by employing various data sources, which provide a better foundation for model development. Finally, Study III generated a model to predict therapy outcome after a sudden gain (SG) in order to identify crucial predictors of the upward spiral. EN was used to generate the model using data from 794 cases that experienced an SG. A control group of the same size was also used to quantify the identified predictors and put them into perspective relative to their general influence on therapy outcomes. The results indicated that there are seven key predictors that have varying effect sizes on therapy outcome, with Cohen's d ranging from 1.08 to 12.48. The findings suggested that a directive approach is more likely to lead to better outcomes after an SG, and that alliance ruptures can be effectively compensated for. However, these effects were reversed in the control group. The results of the three studies are discussed regarding their usefulness to support clinical decision-making and their implications for the implementation of precision mental health.
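Effect sizes in the abstract above are reported as Cohen's d. As a reminder of what that number measures, here is a minimal sketch of the usual pooled-standard-deviation form. The function name `cohens_d` and the toy numbers are assumptions for illustration only; the abstract does not state which variant of d the studies used.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Standardized mean difference using the pooled sample standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    # Pool the two sample variances, weighted by their degrees of freedom.
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


# Toy numbers (not data from the studies): means 4 and 3, pooled SD 2.
print(cohens_d([2, 4, 6], [1, 3, 5]))  # 0.5
```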
Data used for machine learning are often erroneous. In this thesis, p-quasinorms (p<1) are employed as loss functions in order to increase the robustness of training algorithms for artificial neural networks. Numerical issues arising from these loss functions are addressed via enhanced optimization algorithms (proximal point methods; Frank-Wolfe methods) based on the (non-monotonic) Armijo rule. Numerical experiments comprising 1100 test problems confirm the effectiveness of the approach. Depending on the parametrization, an average reduction of the absolute residuals of up to 64.6% is achieved (aggregated over 100 test problems).
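To illustrate why an exponent p<1 can make a loss more robust to erroneous data, the following sketch compares a squared loss with the sum of |r|^p over the residuals. The exact form of the loss is an assumption here (the p-th power of the p-quasinorm), as are the function names and the toy residuals; the thesis may define its loss differently.

```python
def quasinorm_loss(residuals, p=0.5):
    """Sum of |r|^p; for 0 < p < 1 this is the p-th power of the p-quasinorm."""
    if not 0 < p < 1:
        raise ValueError("p must lie strictly between 0 and 1")
    return sum(abs(r) ** p for r in residuals)


def squared_loss(residuals):
    """Classical sum of squared residuals, shown for comparison."""
    return sum(r * r for r in residuals)


clean = [1.0, 1.0, 1.0, 1.0]
corrupted = [1.0, 1.0, 1.0, 100.0]  # one erroneous observation (made up)

# The single outlier dominates the squared loss far more strongly
# than it dominates the quasinorm-based loss.
print(squared_loss(corrupted) / squared_loss(clean))
print(quasinorm_loss(corrupted) / quasinorm_loss(clean))
```

In this toy case the outlier inflates the squared loss by a factor of roughly 2500, but the p = 0.5 loss only by a factor of roughly 3, which is the intuition behind the robustness claim. The price is a non-convex, non-smooth objective, which is consistent with the numerical issues the abstract mentions.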
Situated in the field of information extraction, this thesis combines several methods from machine learning. It presents a new algorithm that links semi-supervised learning with active learning. The starting point is an analysis of the data in which the data are split into several views; here, the inputs of different people are kept separate. For each view independently, the algorithm uses classifiers to build models from the individual annotations of each person. To obtain the required amount of data, crowdsourcing is used, which makes it possible to reach a large number of people. These people are given the task of annotating texts. On the one hand, this is initially carried out for a historical text corpus, and the thesis describes the steps needed to offer and run the annotation task on crowdsourcing portals. On the other hand, a current dataset of short messages is used. The algorithm is applied to these example datasets. Experiments are used to determine the optimal choice of parameters. In addition, the results are compared with those of previous algorithms.
We consider a linear regression model for which we assume that some of the observed variables are irrelevant for the prediction. Including the wrong variables in the statistical model can lead either to having too little information to properly estimate the statistic of interest, or to having too much information and consequently describing fictitious connections. This thesis uses discrete optimization to conduct variable selection. To this end, the subset selection regression method is analyzed. The approach has gained a lot of interest in recent years due to its promising predictive performance. A major challenge associated with subset selection regression is its computational difficulty. In this thesis, we propose several improvements to the efficiency of the method. Novel bounds on the coefficients of the subset selection regression are developed, which help to tighten the relaxation of the associated mixed-integer program, which relies on a Big-M formulation. Moreover, a novel mixed-integer linear formulation for the subset selection regression based on a bilevel optimization reformulation is proposed. Finally, it is shown that the perspective formulation of the subset selection regression is equivalent to a state-of-the-art binary formulation. We use this insight to develop novel bounds for the subset selection regression problem, which prove to be highly effective in combination with the proposed linear formulation.
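For orientation, a common textbook way of writing subset selection regression as a mixed-integer program with Big-M constraints is sketched below. The notation is an assumption made here for illustration (design matrix X, response y, coefficients β, binary indicators z, cardinality bound k, constant M); the thesis may use different symbols and, as the abstract indicates, tighter coefficient-specific bounds in place of a single M.

```latex
\begin{aligned}
\min_{\beta \in \mathbb{R}^{p},\; z \in \{0,1\}^{p}} \quad & \lVert y - X\beta \rVert_2^2 \\
\text{s.t.} \quad & -M z_j \le \beta_j \le M z_j, \qquad j = 1, \dots, p, \\
& \sum_{j=1}^{p} z_j \le k .
\end{aligned}
```

Setting z_j = 0 forces β_j = 0, so the last constraint limits the number of selected variables to k. The smaller a valid M can be chosen, the tighter the continuous relaxation becomes, which is why bounds on the coefficients matter for solver performance.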
In the second part of this thesis, we examine the statistical conception of the subset selection regression and conclude that it is misaligned with its intention. The subset selection regression uses the training error to decide which variables to select. It thus validates on the training data, and the training error is often a poor estimate of the prediction error. Hence, the method requires a predetermined cardinality bound. Instead, we propose to select variables with respect to the cross-validation value. The process is formulated as a mixed-integer program in which the sparsity itself becomes subject to optimization. Usually, cross-validation is used to select the best model out of a few options. With the proposed program, the best model out of all possible models is selected. Since the cross-validation value is a much better estimate of the prediction error, the model can select the best sparsity itself.
The thesis concludes with an extensive simulation study, which provides evidence that discrete optimization can be used to produce highly valuable predictive models, with the cross-validation subset selection regression almost always producing the best results.
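The idea of choosing the variable subset by cross-validation error instead of training error can be illustrated with a brute-force sketch for tiny problems. This is not the mixed-integer program proposed in the thesis: it simply enumerates every subset, which is only feasible for a handful of variables. All function names, the interleaved fold scheme, and the toy data are assumptions for illustration.

```python
from itertools import combinations


def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def fit(rows, y, subset):
    """Least-squares coefficients for the chosen columns via the normal equations."""
    xtx = [[sum(r[i] * r[j] for r in rows) for j in subset] for i in subset]
    xty = [sum(r[i] * t for r, t in zip(rows, y)) for i in subset]
    return solve(xtx, xty)


def cv_error(rows, y, subset, folds=4):
    """Mean squared prediction error over interleaved folds (index modulo folds)."""
    total = 0.0
    for f in range(folds):
        train = [i for i in range(len(y)) if i % folds != f]
        held_out = [i for i in range(len(y)) if i % folds == f]
        beta = fit([rows[i] for i in train], [y[i] for i in train], subset)
        for i in held_out:
            pred = sum(b * rows[i][j] for b, j in zip(beta, subset))
            total += (y[i] - pred) ** 2
    return total / len(y)


def best_subset_by_cv(rows, y):
    """Return the non-empty column subset with the smallest cross-validation error."""
    p = len(rows[0])
    candidates = [s for k in range(1, p + 1) for s in combinations(range(p), k)]
    return min(candidates, key=lambda s: cv_error(rows, y, s))


# Toy data (made up): the response depends on column 0 only.
rows = [[x, (-1) ** i, 1 if i % 4 < 2 else -1] for i, x in enumerate(range(1, 9))]
y = [2 * r[0] for r in rows]
print(best_subset_by_cv(rows, y))  # (0,)
```

Note that no cardinality bound is passed in: the sparsity is a result of the search, which mirrors the point made in the abstract. In this noise-free toy example, ties are resolved in favor of the smaller subset only because candidates are enumerated by increasing size.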