Robust Multiple Regression

As modern data analysis pushes the boundaries of classical statistics, it is timely to reexamine alternate approaches to dealing with outliers in multiple regression. As sample sizes and the number of predictors increase, interactive methodology becomes less effective. Likewise, with limited understanding of the underlying contamination process, diagnostics are likely to fail as well. In this article, we advocate for a non-likelihood procedure that attempts to quantify the fraction of bad data as a part of the estimation step. These ideas also allow for the selection of important predictors under some assumptions. As there are many robust algorithms available, running several and looking for interesting differences is a sensible strategy for understanding the nature of the outliers.
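
The abstract's closing point, that fitting several robust estimators and comparing them can reveal the nature of the outliers, can be illustrated with off-the-shelf tools. The sketch below is illustrative only and does not implement the article's non-likelihood procedure; it assumes NumPy and scikit-learn are available and compares ordinary least squares against three standard robust regressors on simulated data whose responses are partially contaminated.

```python
# Illustrative sketch (not the article's procedure): fit OLS and several
# standard robust regressors to simulated data with contaminated responses
# and compare the estimated coefficients. Assumes NumPy and scikit-learn.
import numpy as np
from sklearn.linear_model import (
    LinearRegression,
    HuberRegressor,
    RANSACRegressor,
    TheilSenRegressor,
)

rng = np.random.default_rng(0)
n, p = 500, 5
true_beta = np.array([2.0, -1.0, 0.5, 0.0, 3.0])  # fourth predictor is irrelevant

X = rng.normal(size=(n, p))
y = X @ true_beta + rng.normal(scale=1.0, size=n)

# Contaminate roughly 15% of the responses with a shifted, noisier process.
bad = rng.random(n) < 0.15
y[bad] += rng.normal(loc=10.0, scale=5.0, size=bad.sum())

fits = {
    "OLS": LinearRegression().fit(X, y).coef_,
    "Huber": HuberRegressor().fit(X, y).coef_,
    "RANSAC": RANSACRegressor(random_state=0).fit(X, y).estimator_.coef_,
    "Theil-Sen": TheilSenRegressor(random_state=0).fit(X, y).coef_,
}

print("true     :", np.round(true_beta, 2))
for name, coef in fits.items():
    print(f"{name:9s}:", np.round(coef, 2))
```

When the robust fits agree with one another but diverge from OLS, a non-trivial fraction of the rows is likely contaminated; which estimators disagree, and by how much, hints at the kind of outliers present, in the spirit of the strategy the abstract recommends.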

Bibliographic Details
Main Authors: Scott, David W., Wang, Zhipeng
Format: Online Article Text
Language: English
Published: MDPI 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7826993/
https://www.ncbi.nlm.nih.gov/pubmed/33435467
http://dx.doi.org/10.3390/e23010088
_version_ 1783640653577584640
author Scott, David W.
Wang, Zhipeng
author_facet Scott, David W.
Wang, Zhipeng
author_sort Scott, David W.
collection PubMed
description As modern data analysis pushes the boundaries of classical statistics, it is timely to reexamine alternate approaches to dealing with outliers in multiple regression. As sample sizes and the number of predictors increase, interactive methodology becomes less effective. Likewise, with limited understanding of the underlying contamination process, diagnostics are likely to fail as well. In this article, we advocate for a non-likelihood procedure that attempts to quantify the fraction of bad data as a part of the estimation step. These ideas also allow for the selection of important predictors under some assumptions. As there are many robust algorithms available, running several and looking for interesting differences is a sensible strategy for understanding the nature of the outliers.
format Online
Article
Text
id pubmed-7826993
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-78269932021-02-24 Robust Multiple Regression Scott, David W. Wang, Zhipeng Entropy (Basel) Article As modern data analysis pushes the boundaries of classical statistics, it is timely to reexamine alternate approaches to dealing with outliers in multiple regression. As sample sizes and the number of predictors increase, interactive methodology becomes less effective. Likewise, with limited understanding of the underlying contamination process, diagnostics are likely to fail as well. In this article, we advocate for a non-likelihood procedure that attempts to quantify the fraction of bad data as a part of the estimation step. These ideas also allow for the selection of important predictors under some assumptions. As there are many robust algorithms available, running several and looking for interesting differences is a sensible strategy for understanding the nature of the outliers. MDPI 2021-01-09 /pmc/articles/PMC7826993/ /pubmed/33435467 http://dx.doi.org/10.3390/e23010088 Text en © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
spellingShingle Article
Scott, David W.
Wang, Zhipeng
Robust Multiple Regression
title Robust Multiple Regression
title_full Robust Multiple Regression
title_fullStr Robust Multiple Regression
title_full_unstemmed Robust Multiple Regression
title_short Robust Multiple Regression
title_sort robust multiple regression
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7826993/
https://www.ncbi.nlm.nih.gov/pubmed/33435467
http://dx.doi.org/10.3390/e23010088
work_keys_str_mv AT scottdavidw robustmultipleregression
AT wangzhipeng robustmultipleregression