Robust Supervised Learning in Multiple Environments

University essay from Uppsala universitet/Institutionen för informationsteknologi

Author: Xinran Wei; [2023]


Abstract: In supervised machine learning, it is common practice to choose a loss function for learning predictive models, such as linear regression models and nonlinear neural networks. The primary objective is to attain accurate predictions. However, this becomes increasingly challenging with heterogeneous data drawn from multiple distributions, because the relationship between the covariates and the outcome may vary across domains. Therefore, in this study, we introduce a robust linear predictor, named GIM (Gradient Invariant Method), designed to identify linear relationships between the covariates and the outcome that remain invariant across environments, thereby enabling stable performance in both observed and unseen environments. We evaluate the stability of GIM on both synthetic and real-world data, comparing its performance with the standard method of empirical risk minimization (ERM). The empirical results show that GIM outperforms ERM in most scenarios.
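
The abstract contrasts pooled ERM with an invariance-seeking linear predictor. As a rough illustration of that general idea (not the thesis's actual GIM objective), the sketch below fits a linear model by minimizing the pooled squared-error risk plus a penalty on how much the per-environment risk gradients disagree; the synthetic data-generating process, the penalty form, the weight lam, and the helper names make_env and fit are assumptions made for this example only.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_env(n, spurious_noise):
    """x1 causes y in every environment; x2 is a noisy copy of y whose
    noise level changes across environments (a spurious correlate)."""
    x1 = rng.normal(size=n)
    y = x1 + 0.5 * rng.normal(size=n)
    x2 = y + spurious_noise * rng.normal(size=n)
    return np.column_stack([x1, x2]), y

envs = [make_env(5000, s) for s in (0.1, 1.0)]   # two training environments

# Sufficient statistics of the squared-error risk in each environment.
A = [X.T @ X / len(y) for X, y in envs]          # second-moment matrices
b = [X.T @ y / len(y) for X, y in envs]
A_bar, b_bar = np.mean(A, axis=0), np.mean(b, axis=0)

def fit(lam):
    """Solve min_w pooled MSE + lam * sum_e ||grad_e(w) - mean grad(w)||^2.
    Both terms are quadratic in w, so the normal equations have a closed form."""
    lhs = A_bar.copy()
    rhs = b_bar.copy()
    for Ae, be in zip(A, b):
        D, d = Ae - A_bar, be - b_bar                # gradient-disagreement terms
        lhs += 4.0 * lam * D.T @ D
        rhs += 4.0 * lam * D.T @ d
    return np.linalg.solve(lhs, rhs)

print("pooled ERM           :", fit(lam=0.0))    # leans on the spurious x2
print("invariance-penalised :", fit(lam=100.0))  # close to the invariant (1, 0)
```

In this toy setting the pooled ERM solution places noticeable weight on the spurious feature x2, whereas the gradient-disagreement penalty pushes the predictor toward the coefficient vector (1, 0) that remains valid if x2's relation to y changes in a new environment, which is the kind of stability the abstract attributes to GIM.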
