P-values in genomics: Apparent precision masks high uncertainty
Scientists often interpret P-values as measures of the relative strength of statistical findings. This is common practice in large-scale genomic studies where P-values are used to choose which of numerous hypothesis test results should be pursued in subsequent research. In this study, we examine P-v...
Main authors: | Lazzeroni, L C; Lu, Y; Belitskaya-Lévy, I |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Nature Publishing Group, 2014 |
Subjects: | Original Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4255087/ https://www.ncbi.nlm.nih.gov/pubmed/24419042 http://dx.doi.org/10.1038/mp.2013.184 |
_version_ | 1782347391829016576 |
---|---|
author | Lazzeroni, L C Lu, Y Belitskaya-Lévy, I |
author_facet | Lazzeroni, L C Lu, Y Belitskaya-Lévy, I |
author_sort | Lazzeroni, L C |
collection | PubMed |
description | Scientists often interpret P-values as measures of the relative strength of statistical findings. This is common practice in large-scale genomic studies where P-values are used to choose which of numerous hypothesis test results should be pursued in subsequent research. In this study, we examine P-value variability to assess the degree of certainty P-values provide. We develop prediction intervals for the P-value in a replication study given the P-value observed in an initial study. The intervals depend on the initial value of P and the ratio of sample sizes between the initial and replication studies, but not on the underlying effect size or initial sample size. The intervals are valid for most large-sample statistical tests in any context, and can be used in the presence of single or multiple tests. While P-values are highly variable, future P-value variability can be explicitly predicted based on a P-value from an initial study. The relative size of the replication and initial study is an important predictor of the P-value in a subsequent replication study. We provide a handy calculator implementing these results and apply them to a study of Alzheimer's disease and recent findings of the Cross-Disorder Group of the Psychiatric Genomics Consortium. This study suggests that overinterpretation of very significant, but highly variable, P-values is an important factor contributing to the unexpectedly high incidence of non-replication. Formal prediction intervals can also provide realistic interpretations and comparisons of P-values associated with different estimated effect sizes and sample sizes. |
format | Online Article Text |
id | pubmed-4255087 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2014 |
publisher | Nature Publishing Group |
record_format | MEDLINE/PubMed |
spelling | pubmed-4255087 2014-12-11 Mol Psychiatry Original Article Nature Publishing Group 2014-12 2014-01-14 /pmc/articles/PMC4255087/ /pubmed/24419042 http://dx.doi.org/10.1038/mp.2013.184 Text en Copyright © 2014 Macmillan Publishers Limited. This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported License. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc-nd/3.0/ |
spellingShingle | Original Article Lazzeroni, L C Lu, Y Belitskaya-Lévy, I P-values in genomics: Apparent precision masks high uncertainty |
title | P-values in genomics: Apparent precision masks high uncertainty |
title_full | P-values in genomics: Apparent precision masks high uncertainty |
title_fullStr | P-values in genomics: Apparent precision masks high uncertainty |
title_full_unstemmed | P-values in genomics: Apparent precision masks high uncertainty |
title_short | P-values in genomics: Apparent precision masks high uncertainty |
title_sort | p-values in genomics: apparent precision masks high uncertainty |
topic | Original Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4255087/ https://www.ncbi.nlm.nih.gov/pubmed/24419042 http://dx.doi.org/10.1038/mp.2013.184 |
work_keys_str_mv | AT lazzeronilc pvaluesingenomicsapparentprecisionmaskshighuncertainty AT luy pvaluesingenomicsapparentprecisionmaskshighuncertainty AT belitskayalevyi pvaluesingenomicsapparentprecisionmaskshighuncertainty |
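
The description field states that the prediction interval for a replication P-value depends only on the initial P-value and the ratio of sample sizes. The record does not give the formula, so the following is only a minimal sketch under a standard large-sample assumption: both studies yield normally distributed test statistics with unit variance and the same underlying effect. Under that assumption the replication z-score, minus the initial z-score scaled by the square root of the size ratio, is normal with mean 0 and variance 1 + ratio, which does not involve the effect size. The function name `replication_p_interval`, its parameters, and the use of one-sided P-values are illustrative choices and may differ from the authors' calculator.

```python
from math import erfc, sqrt
from statistics import NormalDist


def replication_p_interval(p_initial, size_ratio=1.0, level=0.95):
    """Sketch of a prediction interval for a one-sided replication P-value.

    p_initial  : one-sided P-value observed in the initial study (0 < p < 1)
    size_ratio : replication sample size divided by initial sample size
                 (hypothetical parameter name)
    level      : coverage of the prediction interval
    Returns (lower, upper) bounds on the replication P-value.
    """
    if not 0.0 < p_initial < 1.0:
        raise ValueError("p_initial must lie strictly between 0 and 1")
    if size_ratio <= 0.0:
        raise ValueError("size_ratio must be positive")
    if not 0.0 < level < 1.0:
        raise ValueError("level must lie strictly between 0 and 1")

    std_normal = NormalDist()

    # Convert the initial one-sided P-value to a z-score.
    z_initial = -std_normal.inv_cdf(p_initial)

    # Assumed pivot: z_rep - sqrt(ratio) * z_initial ~ N(0, 1 + ratio).
    centre = sqrt(size_ratio) * z_initial
    half_width = std_normal.inv_cdf(0.5 + level / 2.0) * sqrt(1.0 + size_ratio)

    def upper_tail(z):
        # erfc keeps precision for the very small tail areas seen in genomics.
        return 0.5 * erfc(z / sqrt(2.0))

    # A larger z-score means a smaller P-value, so the bounds swap order.
    return upper_tail(centre + half_width), upper_tail(centre - half_width)


if __name__ == "__main__":
    low, high = replication_p_interval(5e-8, size_ratio=1.0)
    print(f"95% prediction interval for the replication P-value: {low:.3g} to {high:.3g}")
```

Calling the function with a larger `size_ratio` shifts the whole interval toward smaller P-values, which matches the description's point that the relative size of the replication study is an important predictor of the replication P-value.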