
Assessing the accuracy of predictive models for numerical data: Not r nor r², why not? Then what?

Assessing the accuracy of predictive models is critical because predictive models have been increasingly used across various disciplines and predictive accuracy determines the quality of resultant predictions. Pearson product-moment correlation coefficient (r) and the coefficient of determination (r...


Bibliographic Details
Main Author:	Li, Jin
Format:	Online Article Text
Language:	English
Published:	Public Library of Science 2017
Subjects:	Research Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5570302/ https://www.ncbi.nlm.nih.gov/pubmed/28837692 http://dx.doi.org/10.1371/journal.pone.0183250

_version_	1783259154789433344
author	Li, Jin
collection	PubMed
description	Assessing the accuracy of predictive models is critical because predictive models are increasingly used across various disciplines and predictive accuracy determines the quality of the resulting predictions. The Pearson product-moment correlation coefficient (r) and the coefficient of determination (r²) are among the most widely used measures for assessing predictive models for numerical data, although they are argued to be biased, insufficient and misleading. In this study, geometrical graphs were used to illustrate what is used in the calculation of r and r², and simulations were used to demonstrate the behaviour of r and r² and to compare three accuracy measures under various scenarios. Relevant confusions about r and r² have been clarified. The calculation of r and r² is not based on the differences between the predicted and observed values. The existing error measures suffer from various limitations and are unable to tell the accuracy. Variance explained by predictive models based on cross-validation (VEcv) is free of these limitations and is a reliable accuracy measure. Legates and McCabe’s efficiency (E₁) is also an alternative accuracy measure. The r and r² do not measure the accuracy and are incorrect accuracy measures. The existing error measures suffer from limitations. VEcv and E₁ are recommended for assessing the accuracy. Applying these accuracy measures would encourage the development of accuracy-improved predictive models that generate predictions for evidence-informed decision-making.
format	Online Article Text
id	pubmed-5570302
institution	National Center for Biotechnology Information
language	English
publishDate	2017
publisher	Public Library of Science
record_format	MEDLINE/PubMed
spelling	pubmed-5570302 2017-09-09 Assessing the accuracy of predictive models for numerical data: Not r nor r², why not? Then what? Li, Jin PLoS One Research Article
Public Library of Science 2017-08-24 /pmc/articles/PMC5570302/ /pubmed/28837692 http://dx.doi.org/10.1371/journal.pone.0183250 Text en © 2017 Jin Li http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/) , which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title	Assessing the accuracy of predictive models for numerical data: Not r nor r², why not? Then what?
topic	Research Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5570302/ https://www.ncbi.nlm.nih.gov/pubmed/28837692 http://dx.doi.org/10.1371/journal.pone.0183250
work_keys_str_mv	AT lijin assessingtheaccuracyofpredictivemodelsfornumericaldatanotrnorr2whynotthenwhat
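
The abstract's central claim, that r and r² are not calculated from the differences between predicted and observed values, can be illustrated with a small numerical sketch. The record does not give formulas, so the definitions below are assumptions based on the commonly cited forms: variance explained as (1 - SSE/SST) × 100 and Legates and McCabe's E₁ as (1 - sum of absolute errors / sum of absolute deviations from the observed mean) × 100. The function names and the toy data are hypothetical. For VEcv the predictions would come from held-out cross-validation folds; here a fixed prediction vector stands in for them.

```python
import numpy as np


def pearson_r(observed, predicted):
    """Pearson correlation between observed and predicted values."""
    return float(np.corrcoef(observed, predicted)[0, 1])


def variance_explained(observed, predicted):
    """Assumed form of VE: (1 - SSE / SST) * 100, in percent.

    For VEcv, `predicted` should hold cross-validated predictions.
    """
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    sse = np.sum((observed - predicted) ** 2)
    sst = np.sum((observed - observed.mean()) ** 2)
    return float((1.0 - sse / sst) * 100.0)


def legates_mccabe_e1(observed, predicted):
    """Assumed form of E1: (1 - sum|error| / sum|obs - mean|) * 100."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    abs_err = np.sum(np.abs(observed - predicted))
    abs_dev = np.sum(np.abs(observed - observed.mean()))
    return float((1.0 - abs_err / abs_dev) * 100.0)


# Hypothetical toy data: every prediction is too high by 3 units.
observed = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
offset_pred = observed + 3.0

print("r    =", pearson_r(observed, offset_pred))
print("VE % =", variance_explained(observed, offset_pred))
print("E1 % =", legates_mccabe_e1(observed, offset_pred))
```

With these toy values r is 1 even though every prediction is wrong by 3 units, while the two difference-based measures come out negative (-350% and -150%), meaning the predictions do worse than simply using the observed mean.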
