Analysis validation has been neglected in the Age of Reproducibility
Increasingly complex statistical models are being used for the analysis of biological data. Recent commentary has focused on the ability to compute the same outcome for a given dataset (reproducibility). We argue that a reproducible statistical analysis is not necessarily valid because of unique pat...
Main Authors: | Lotterhos, Kathleen E.; Moore, Jason H.; Stapleton, Ann E. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science 2018 |
Subjects: | Essay |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6301703/ https://www.ncbi.nlm.nih.gov/pubmed/30532167 http://dx.doi.org/10.1371/journal.pbio.3000070 |
_version_ | 1783381846784999424 |
---|---|
author | Lotterhos, Kathleen E. Moore, Jason H. Stapleton, Ann E. |
author_facet | Lotterhos, Kathleen E. Moore, Jason H. Stapleton, Ann E. |
author_sort | Lotterhos, Kathleen E. |
collection | PubMed |
description | Increasingly complex statistical models are being used for the analysis of biological data. Recent commentary has focused on the ability to compute the same outcome for a given dataset (reproducibility). We argue that a reproducible statistical analysis is not necessarily valid because of unique patterns of nonindependence in every biological dataset. We advocate that analyses should be evaluated with known-truth simulations that capture biological reality, a process we call “analysis validation.” We review the process of validation and suggest criteria that a validation project should meet. We find that different fields of science have historically failed to meet all criteria, and we suggest ways to implement meaningful validation in training and practice. |
format | Online Article Text |
id | pubmed-6301703 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2018 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-63017032019-01-08 Analysis validation has been neglected in the Age of Reproducibility Lotterhos, Kathleen E. Moore, Jason H. Stapleton, Ann E. PLoS Biol Essay Increasingly complex statistical models are being used for the analysis of biological data. Recent commentary has focused on the ability to compute the same outcome for a given dataset (reproducibility). We argue that a reproducible statistical analysis is not necessarily valid because of unique patterns of nonindependence in every biological dataset. We advocate that analyses should be evaluated with known-truth simulations that capture biological reality, a process we call “analysis validation.” We review the process of validation and suggest criteria that a validation project should meet. We find that different fields of science have historically failed to meet all criteria, and we suggest ways to implement meaningful validation in training and practice. Public Library of Science 2018-12-10 /pmc/articles/PMC6301703/ /pubmed/30532167 http://dx.doi.org/10.1371/journal.pbio.3000070 Text en © 2018 Lotterhos et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/) , which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. |
spellingShingle | Essay Lotterhos, Kathleen E. Moore, Jason H. Stapleton, Ann E. Analysis validation has been neglected in the Age of Reproducibility |
title | Analysis validation has been neglected in the Age of Reproducibility |
title_full | Analysis validation has been neglected in the Age of Reproducibility |
title_fullStr | Analysis validation has been neglected in the Age of Reproducibility |
title_full_unstemmed | Analysis validation has been neglected in the Age of Reproducibility |
title_short | Analysis validation has been neglected in the Age of Reproducibility |
title_sort | analysis validation has been neglected in the age of reproducibility |
topic | Essay |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6301703/ https://www.ncbi.nlm.nih.gov/pubmed/30532167 http://dx.doi.org/10.1371/journal.pbio.3000070 |
work_keys_str_mv | AT lotterhoskathleene analysisvalidationhasbeenneglectedintheageofreproducibility AT moorejasonh analysisvalidationhasbeenneglectedintheageofreproducibility AT stapletonanne analysisvalidationhasbeenneglectedintheageofreproducibility |