No Longer Confidential: Estimating the Confidence of Individual Regression Predictions

Quantitative predictions in computational life sciences are often based on regression models. The advent of machine learning has led to highly accurate regression models that have gained widespread acceptance. While there are statistical methods available to estimate the global performance of regres...

Bibliographic Details
Main authors:	Briesemeister, Sebastian; Rahnenführer, Jörg; Kohlbacher, Oliver
Format:	Online Article Text
Language:	English
Published:	Public Library of Science 2012
Subjects:	Research Article
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3499506/ https://www.ncbi.nlm.nih.gov/pubmed/23166592 http://dx.doi.org/10.1371/journal.pone.0048723

author	Briesemeister, Sebastian Rahnenführer, Jörg Kohlbacher, Oliver
collection	PubMed
description	Quantitative predictions in computational life sciences are often based on regression models. The advent of machine learning has led to highly accurate regression models that have gained widespread acceptance. While there are statistical methods available to estimate the global performance of regression models on a test or training dataset, it is often not clear how well this performance transfers to other datasets or how reliable an individual prediction is, a fact that often reduces a user’s trust in a computational method. In analogy to the concept of an experimental error, we sketch how estimators for individual prediction errors can be used to provide confidence intervals for individual predictions. Two novel statistical methods, named CONFINE and CONFIVE, can estimate the reliability of an individual prediction based on the local properties of nearby training data. The methods can be applied equally to linear and non-linear regression methods with very little computational overhead. We compare our confidence estimators with other existing confidence and applicability domain estimators on two biologically relevant problems (MHC–peptide binding prediction and quantitative structure-activity relationship (QSAR)). Our results suggest that the proposed confidence estimators perform comparably to or better than previously proposed estimation methods. Given a sufficient amount of training data, the estimators exhibit error estimates of high quality. In addition, we observed that the quality of estimated confidence intervals is predictable. We discuss how confidence estimation is influenced by noise, the number of features, and the dataset size. Estimating the confidence in individual predictions in terms of error intervals represents an important step from plain, non-informative predictions towards transparent and interpretable predictions that will help to improve the acceptance of computational methods in the biological community.
format	Online Article Text
id	pubmed-3499506
institution	National Center for Biotechnology Information
language	English
publishDate	2012
publisher	Public Library of Science
record_format	MEDLINE/PubMed
spelling	pubmed-3499506 2012-11-19 No Longer Confidential: Estimating the Confidence of Individual Regression Predictions Briesemeister, Sebastian Rahnenführer, Jörg Kohlbacher, Oliver PLoS One Research Article Public Library of Science 2012-11-15 /pmc/articles/PMC3499506/ /pubmed/23166592 http://dx.doi.org/10.1371/journal.pone.0048723 Text en © 2012 Briesemeister et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title	No Longer Confidential: Estimating the Confidence of Individual Regression Predictions
topic	Research Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3499506/ https://www.ncbi.nlm.nih.gov/pubmed/23166592 http://dx.doi.org/10.1371/journal.pone.0048723
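
The abstract in this record says that the reliability of an individual prediction can be estimated from the local properties of nearby training data. The following is a minimal Python sketch of that general idea, not the authors' implementation of CONFINE or CONFIVE. The two scores (mean squared error of the nearest training neighbors, and variance of their observed responses), the Euclidean distance, the value of k, the use of in-sample residuals, and all function names are assumptions made for illustration only.

```python
import numpy as np


def nearest_neighbors(X_train, x, k):
    """Indices of the k training points closest to x (Euclidean distance)."""
    distances = np.linalg.norm(X_train - x, axis=1)
    return np.argsort(distances)[:k]


def neighbor_error_score(X_train, y_train, y_train_pred, x, k=5):
    """Hypothetical score: mean squared model error on the k nearest training points."""
    idx = nearest_neighbors(X_train, x, k)
    return float(np.mean((y_train[idx] - y_train_pred[idx]) ** 2))


def neighbor_variance_score(X_train, y_train, x, k=5):
    """Hypothetical score: variance of the observed responses of the k nearest training points."""
    idx = nearest_neighbors(X_train, x, k)
    return float(np.var(y_train[idx]))


def demo(seed=0):
    """Synthetic data (an assumption): noise is small for x < 0.5 and large otherwise."""
    rng = np.random.default_rng(seed)
    X = rng.uniform(0.0, 1.0, size=(200, 1))
    noise_sd = np.where(X[:, 0] < 0.5, 0.01, 1.0)
    y = 2.0 * X[:, 0] + rng.normal(0.0, noise_sd)

    # Ordinary least-squares line as a stand-in regression model.
    A = np.column_stack([X[:, 0], np.ones(len(X))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    y_fit = A @ coef  # in-sample predictions; held-out predictions would be less optimistic

    scores = []
    for query in (0.25, 0.75):  # one query in the quiet region, one in the noisy region
        x = np.array([query])
        scores.append((
            neighbor_error_score(X, y, y_fit, x, k=10),
            neighbor_variance_score(X, y, x, k=10),
        ))
    return scores[0], scores[1]


if __name__ == "__main__":
    quiet, noisy = demo()
    print("quiet region (error score, variance score):", quiet)
    print("noisy region (error score, variance score):", noisy)
```

In this sketch a larger score means a less trustworthy prediction. Turning such scores into error intervals would need a further calibration step, which is not shown here.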
