The cost of large numbers of hypothesis tests on power, effect size and sample size
Advances in high-throughput biology and computer science are driving an exponential increase in the number of hypothesis tests in genomics and other scientific disciplines. Studies using current genotyping platforms frequently include a million or more tests. In addition to the monetary cost, this i...
Main Authors: | Lazzeroni, L C, Ray, A |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Nature Publishing Group 2012 |
Subjects: | Original Article |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3252610/ https://www.ncbi.nlm.nih.gov/pubmed/21060308 http://dx.doi.org/10.1038/mp.2010.117 |
_version_ | 1782220649686630400 |
---|---|
author | Lazzeroni, L C Ray, A |
author_facet | Lazzeroni, L C Ray, A |
author_sort | Lazzeroni, L C |
collection | PubMed |
description | Advances in high-throughput biology and computer science are driving an exponential increase in the number of hypothesis tests in genomics and other scientific disciplines. Studies using current genotyping platforms frequently include a million or more tests. In addition to the monetary cost, this increase imposes a statistical cost owing to the multiple testing corrections needed to avoid large numbers of false-positive results. To safeguard against the resulting loss of power, some have suggested sample sizes on the order of tens of thousands that can be impractical for many diseases or may lower the quality of phenotypic measurements. This study examines the relationship between the number of tests on the one hand and power, detectable effect size or required sample size on the other. We show that once the number of tests is large, power can be maintained at a constant level, with comparatively small increases in the effect size or sample size. For example at the 0.05 significance level, a 13% increase in sample size is needed to maintain 80% power for ten million tests compared with one million tests, whereas a 70% increase in sample size is needed for 10 tests compared with a single test. Relative costs are less when measured by increases in the detectable effect size. We provide an interactive Excel calculator to compute power, effect size or sample size when comparing study designs or genome platforms involving different numbers of hypothesis tests. The results are reassuring in an era of extreme multiple testing. |
format | Online Article Text |
id | pubmed-3252610 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2012 |
publisher | Nature Publishing Group |
record_format | MEDLINE/PubMed |
spelling | pubmed-32526102012-01-10 The cost of large numbers of hypothesis tests on power, effect size and sample size Lazzeroni, L C Ray, A Mol Psychiatry Original Article Advances in high-throughput biology and computer science are driving an exponential increase in the number of hypothesis tests in genomics and other scientific disciplines. Studies using current genotyping platforms frequently include a million or more tests. In addition to the monetary cost, this increase imposes a statistical cost owing to the multiple testing corrections needed to avoid large numbers of false-positive results. To safeguard against the resulting loss of power, some have suggested sample sizes on the order of tens of thousands that can be impractical for many diseases or may lower the quality of phenotypic measurements. This study examines the relationship between the number of tests on the one hand and power, detectable effect size or required sample size on the other. We show that once the number of tests is large, power can be maintained at a constant level, with comparatively small increases in the effect size or sample size. For example at the 0.05 significance level, a 13% increase in sample size is needed to maintain 80% power for ten million tests compared with one million tests, whereas a 70% increase in sample size is needed for 10 tests compared with a single test. Relative costs are less when measured by increases in the detectable effect size. We provide an interactive Excel calculator to compute power, effect size or sample size when comparing study designs or genome platforms involving different numbers of hypothesis tests. The results are reassuring in an era of extreme multiple testing. 
Nature Publishing Group 2012-01 2010-11-09 /pmc/articles/PMC3252610/ /pubmed/21060308 http://dx.doi.org/10.1038/mp.2010.117 Text en Copyright © 2012 Macmillan Publishers Limited http://creativecommons.org/licenses/by-nc-nd/3.0/ This work is licensed under the Creative Commons Attribution-NonCommercial-No Derivative Works 3.0 Unported License. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc-nd/3.0/ |
spellingShingle | Original Article Lazzeroni, L C Ray, A The cost of large numbers of hypothesis tests on power, effect size and sample size |
title | The cost of large numbers of hypothesis tests on power, effect size and sample size |
title_full | The cost of large numbers of hypothesis tests on power, effect size and sample size |
title_fullStr | The cost of large numbers of hypothesis tests on power, effect size and sample size |
title_full_unstemmed | The cost of large numbers of hypothesis tests on power, effect size and sample size |
title_short | The cost of large numbers of hypothesis tests on power, effect size and sample size |
title_sort | cost of large numbers of hypothesis tests on power, effect size and sample size |
topic | Original Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3252610/ https://www.ncbi.nlm.nih.gov/pubmed/21060308 http://dx.doi.org/10.1038/mp.2010.117 |
work_keys_str_mv | AT lazzeronilc thecostoflargenumbersofhypothesistestsonpowereffectsizeandsamplesize AT raya thecostoflargenumbersofhypothesistestsonpowereffectsizeandsamplesize AT lazzeronilc costoflargenumbersofhypothesistestsonpowereffectsizeandsamplesize AT raya costoflargenumbersofhypothesistestsonpowereffectsizeandsamplesize |
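
The sample-size figures quoted in the description (a 70% increase for 10 tests versus 1, and a 13% increase for ten million tests versus one million) can be approximated with a short calculation. The sketch below is not the authors' Excel calculator. It assumes a two-sided z-test with a Bonferroni-corrected significance level of alpha / m, under which the required sample size is proportional to (z_alpha + z_power)^2. The function names `relative_sample_size` and `relative_effect_size` are hypothetical and used only for illustration.

```python
from math import sqrt
from statistics import NormalDist

_NORM = NormalDist()  # standard normal distribution


def _required(m, alpha, power):
    """Quantity proportional to the required sample size for m tests.

    Assumption: two-sided z-test with Bonferroni correction (alpha / m).
    """
    # Critical value for a two-sided test at level alpha / m.
    # Using the lower tail avoids precision loss from computing 1 - tiny.
    z_alpha = -_NORM.inv_cdf(alpha / (2 * m))
    # Quantile corresponding to the target power.
    z_power = _NORM.inv_cdf(power)
    return (z_alpha + z_power) ** 2


def relative_sample_size(m_new, m_old, alpha=0.05, power=0.80):
    """Ratio of sample sizes needed to keep power fixed at a fixed effect size."""
    return _required(m_new, alpha, power) / _required(m_old, alpha, power)


def relative_effect_size(m_new, m_old, alpha=0.05, power=0.80):
    """Ratio of detectable effect sizes at a fixed sample size.

    Assumption: the test statistic scales with effect size times sqrt(n),
    so the effect-size ratio is the square root of the sample-size ratio.
    """
    return sqrt(relative_sample_size(m_new, m_old, alpha, power))


if __name__ == "__main__":
    for m_old, m_new in [(1, 10), (10**6, 10**7)]:
        n_ratio = relative_sample_size(m_new, m_old)
        d_ratio = relative_effect_size(m_new, m_old)
        print(f"{m_old} -> {m_new} tests: "
              f"sample size x{n_ratio:.2f}, effect size x{d_ratio:.2f}")
```

Under these assumptions the sample-size ratios come out near 1.70 and 1.13, in line with the percentages in the description. The effect-size ratios are smaller because they are the square roots of the sample-size ratios, which matches the statement that relative costs are lower when measured by detectable effect size.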