The cost of large numbers of hypothesis tests on power, effect size and sample size

Advances in high-throughput biology and computer science are driving an exponential increase in the number of hypothesis tests in genomics and other scientific disciplines. Studies using current genotyping platforms frequently include a million or more tests. In addition to the monetary cost, this i...

Bibliographic Details
Main authors: Lazzeroni, L C, Ray, A
Format: Online Article Text
Language: English
Published: Nature Publishing Group 2012
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3252610/
https://www.ncbi.nlm.nih.gov/pubmed/21060308
http://dx.doi.org/10.1038/mp.2010.117
author Lazzeroni, L C
Ray, A
collection PubMed
description Advances in high-throughput biology and computer science are driving an exponential increase in the number of hypothesis tests in genomics and other scientific disciplines. Studies using current genotyping platforms frequently include a million or more tests. In addition to the monetary cost, this increase imposes a statistical cost owing to the multiple testing corrections needed to avoid large numbers of false-positive results. To safeguard against the resulting loss of power, some have suggested sample sizes on the order of tens of thousands that can be impractical for many diseases or may lower the quality of phenotypic measurements. This study examines the relationship between the number of tests on the one hand and power, detectable effect size or required sample size on the other. We show that once the number of tests is large, power can be maintained at a constant level, with comparatively small increases in the effect size or sample size. For example at the 0.05 significance level, a 13% increase in sample size is needed to maintain 80% power for ten million tests compared with one million tests, whereas a 70% increase in sample size is needed for 10 tests compared with a single test. Relative costs are less when measured by increases in the detectable effect size. We provide an interactive Excel calculator to compute power, effect size or sample size when comparing study designs or genome platforms involving different numbers of hypothesis tests. The results are reassuring in an era of extreme multiple testing.
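The sample-size figures quoted in the description can be approximated with a short calculation. The sketch below is an illustration under stated assumptions, not necessarily the authors' method or their Excel calculator: it assumes a two-sided z-test, a Bonferroni correction (per-test level alpha / m), and the usual normal approximation in which the required sample size is proportional to (z_{alpha/(2m)} + z_{power})^2. The function names `relative_sample_size` and `percent_increase` are hypothetical.

```python
from statistics import NormalDist


def relative_sample_size(m_tests, alpha=0.05, power=0.80):
    """Return a quantity proportional to the required sample size.

    Assumptions (not taken from the record): two-sided z-test,
    Bonferroni-corrected per-test level alpha / m_tests, normal
    approximation, so n is proportional to (z_alpha + z_power) ** 2.
    """
    std_normal = NormalDist()
    # Critical value for the Bonferroni-corrected two-sided test.
    z_alpha = -std_normal.inv_cdf(alpha / (2 * m_tests))
    # Quantile corresponding to the target power.
    z_power = std_normal.inv_cdf(power)
    return (z_alpha + z_power) ** 2


def percent_increase(m_from, m_to, alpha=0.05, power=0.80):
    """Percent increase in sample size needed when moving from m_from to m_to tests."""
    ratio = (relative_sample_size(m_to, alpha, power)
             / relative_sample_size(m_from, alpha, power))
    return 100 * (ratio - 1)


if __name__ == "__main__":
    print(f"1 -> 10 tests: {percent_increase(1, 10):.1f}% more samples")
    print(f"1e6 -> 1e7 tests: {percent_increase(10**6, 10**7):.1f}% more samples")
```

Under these assumptions the two comparisons come out near the 70% and 13% quoted in the description. Because the detectable effect size scales with (z_alpha + z_power) rather than its square, the corresponding increase in effect size is the square root of the sample-size ratio, which is consistent with the statement that relative costs are smaller when measured by effect size.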
format Online
Article
Text
id pubmed-3252610
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher Nature Publishing Group
record_format MEDLINE/PubMed
spelling pubmed-3252610 2012-01-10 The cost of large numbers of hypothesis tests on power, effect size and sample size Lazzeroni, L C Ray, A Mol Psychiatry Original Article
Nature Publishing Group 2012-01 2010-11-09 /pmc/articles/PMC3252610/ /pubmed/21060308 http://dx.doi.org/10.1038/mp.2010.117 Text en
Copyright © 2012 Macmillan Publishers Limited. This work is licensed under the Creative Commons Attribution-NonCommercial-No Derivative Works 3.0 Unported License. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc-nd/3.0/
title The cost of large numbers of hypothesis tests on power, effect size and sample size
topic Original Article