
A Robust Learning Approach for Regression Models Based on Distributionally Robust Optimization

Bibliographic Details
Main Authors: Chen, Ruidi, Paschalidis, Ioannis Ch.
Format: Online Article Text
Language: English
Published: 2018
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8378760/
https://www.ncbi.nlm.nih.gov/pubmed/34421397
_version_ 1783740875434622976
author Chen, Ruidi
Paschalidis, Ioannis Ch.
author_facet Chen, Ruidi
Paschalidis, Ioannis Ch.
author_sort Chen, Ruidi
collection PubMed
description We present a Distributionally Robust Optimization (DRO) approach to estimate a robustified regression plane in a linear regression setting, when the observed samples are potentially contaminated with adversarially corrupted outliers. Our approach mitigates the impact of outliers by hedging against a family of probability distributions on the observed data, some of which assign very low probabilities to the outliers. The set of distributions under consideration is close to the empirical distribution in the sense of the Wasserstein metric. We show that this DRO formulation can be relaxed to a convex optimization problem which encompasses a class of models. By selecting proper norm spaces for the Wasserstein metric, we are able to recover several commonly used regularized regression models. We provide new insights into the regularization term and give guidance on the selection of the regularization coefficient from the standpoint of a confidence region. We establish two types of performance guarantees for the solution to our formulation under mild conditions. One is related to its out-of-sample behavior (prediction bias), and the other concerns the discrepancy between the estimated and true regression planes (estimation bias). Extensive numerical results demonstrate the superiority of our approach over a host of regression models, in terms of prediction and estimation accuracy. We also consider the application of our robust learning procedure to outlier detection, and show that our approach achieves a much higher AUC (Area Under the ROC Curve) than M-estimation (Huber, 1964, 1973).
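For orientation only, a minimal sketch (in assumed notation that is not part of this record) of the kind of formulation the description refers to: with an absolute loss, coefficient vector β, the empirical distribution of the N observed samples, and a Wasserstein ball of radius ε, the worst-case expected loss admits a convex upper bound of regularized empirical-loss form:

\[
  % Hedged sketch: Wasserstein DRO over an absolute-loss linear regression and a
  % convex relaxation; \lVert\cdot\rVert_{*} is the dual of the norm defining the metric.
  \min_{\beta}\; \sup_{Q:\, W(Q,\hat{P}_N)\le\epsilon} \mathbb{E}_{Q}\!\left[\,\lvert y-\mathbf{x}^{\top}\beta\rvert\,\right]
  \;\le\;
  \min_{\beta}\; \frac{1}{N}\sum_{i=1}^{N}\lvert y_i-\mathbf{x}_i^{\top}\beta\rvert
  \;+\;\epsilon\,\bigl\lVert(-\beta,\,1)\bigr\rVert_{*}.
\]

Different choices of the norm underlying the Wasserstein metric yield different dual-norm regularizers on β, which is consistent with the description's claim that several commonly used regularized regression models are recovered as special cases, with ε playing the role of the regularization coefficient.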
format Online
Article
Text
id pubmed-8378760
institution National Center for Biotechnology Information
language English
publishDate 2018
record_format MEDLINE/PubMed
spelling pubmed-8378760 2021-08-20 A Robust Learning Approach for Regression Models Based on Distributionally Robust Optimization Chen, Ruidi Paschalidis, Ioannis Ch. J Mach Learn Res Article 2018-01 2018-01-01 /pmc/articles/PMC8378760/ /pubmed/34421397 Text en License: CC-BY 4.0, see https://creativecommons.org/licenses/by/4.0/.
spellingShingle Article
Chen, Ruidi
Paschalidis, Ioannis Ch.
A Robust Learning Approach for Regression Models Based on Distributionally Robust Optimization
title A Robust Learning Approach for Regression Models Based on Distributionally Robust Optimization
title_full A Robust Learning Approach for Regression Models Based on Distributionally Robust Optimization
title_fullStr A Robust Learning Approach for Regression Models Based on Distributionally Robust Optimization
title_full_unstemmed A Robust Learning Approach for Regression Models Based on Distributionally Robust Optimization
title_short A Robust Learning Approach for Regression Models Based on Distributionally Robust Optimization
title_sort robust learning approach for regression models based on distributionally robust optimization
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8378760/
https://www.ncbi.nlm.nih.gov/pubmed/34421397
work_keys_str_mv AT chenruidi arobustlearningapproachforregressionmodelsbasedondistributionallyrobustoptimization
AT paschalidisioannisch arobustlearningapproachforregressionmodelsbasedondistributionallyrobustoptimization
AT chenruidi robustlearningapproachforregressionmodelsbasedondistributionallyrobustoptimization
AT paschalidisioannisch robustlearningapproachforregressionmodelsbasedondistributionallyrobustoptimization