
Control procedures and estimators of the false discovery rate and their application in low-dimensional settings: an empirical investigation

BACKGROUND: When many (up to millions) of statistical tests are conducted in discovery set analyses such as genome-wide association studies (GWAS), approaches controlling family-wise error rate (FWER) or false discovery rate (FDR) are required to reduce the number of false positive decisions. Some m...


Bibliographic Details
Main authors: Brinster, Regina; Köttgen, Anna; Tayo, Bamidele O.; Schumacher, Martin; Sekula, Peggy
Format: Online Article Text
Language: English
Published: BioMed Central 2018
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5833079/
https://www.ncbi.nlm.nih.gov/pubmed/29499647
http://dx.doi.org/10.1186/s12859-018-2081-x
_version_ 1783303421327048704
author Brinster, Regina
Köttgen, Anna
Tayo, Bamidele O.
Schumacher, Martin
Sekula, Peggy
author_sort Brinster, Regina
collection PubMed
description BACKGROUND: When many (up to millions) of statistical tests are conducted in discovery set analyses such as genome-wide association studies (GWAS), approaches controlling family-wise error rate (FWER) or false discovery rate (FDR) are required to reduce the number of false positive decisions. Some methods were specifically developed in the context of high-dimensional settings and partially rely on the estimation of the proportion of true null hypotheses. However, these approaches are also applied in low-dimensional settings such as replication set analyses that might be restricted to a small number of specific hypotheses. The aim of this study was to compare different approaches in low-dimensional settings using (a) real data from the CKDGen Consortium and (b) a simulation study. RESULTS: In both application and simulation FWER approaches were less powerful compared to FDR control methods, whether a larger number of hypotheses were tested or not. Most powerful was the q-value method. However, the specificity of this method to maintain true null hypotheses was especially decreased when the number of tested hypotheses was small. In this low-dimensional situation, estimation of the proportion of true null hypotheses was biased. CONCLUSIONS: The results highlight the importance of a sizeable data set for a reliable estimation of the proportion of true null hypotheses. Consequently, methods relying on this estimation should only be applied in high-dimensional settings. Furthermore, if the focus lies on testing of a small number of hypotheses such as in replication settings, FWER methods rather than FDR methods should be preferred to maintain high specificity. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12859-018-2081-x) contains supplementary material, which is available to authorized users.
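The comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' code. It assumes Bonferroni as the FWER method, Benjamini-Hochberg as the FDR control procedure, and a q-value computed with a Storey-type estimate of the proportion of true null hypotheses at a fixed tuning value `lam = 0.5`. The function names and the five p-values in the usage lines are hypothetical.

```python
def bonferroni(pvals):
    """FWER control: multiply each p-value by the number of tests, cap at 1."""
    m = len(pvals)
    return [min(1.0, p * m) for p in pvals]


def benjamini_hochberg(pvals):
    """FDR control: step-up adjusted p-values (Benjamini-Hochberg)."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def estimate_pi0(pvals, lam=0.5):
    """Storey-type estimate of the proportion of true nulls (assumed lam)."""
    m = len(pvals)
    n_above = sum(1 for p in pvals if p > lam)
    # With few tests this ratio is very coarse and can even be 0.
    return min(1.0, n_above / ((1.0 - lam) * m))


def q_values(pvals, lam=0.5):
    """q-values as the estimated pi0 times the Benjamini-Hochberg values."""
    pi0 = estimate_pi0(pvals, lam)
    return [pi0 * a for a in benjamini_hochberg(pvals)]


# Hypothetical replication-style setting with only five hypotheses.
pvals = [0.001, 0.008, 0.039, 0.041, 0.6]
alpha = 0.05
for name, values in [("Bonferroni", bonferroni(pvals)),
                     ("Benjamini-Hochberg", benjamini_hochberg(pvals)),
                     ("q-value", q_values(pvals))]:
    print(name, sum(v <= alpha for v in values), "rejections")
```

With only five p-values, the estimate of the proportion of true nulls rests on a single p-value above `lam`, so it moves in steps of 0.4. That coarseness is the kind of instability in low-dimensional settings that the abstract warns about.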
format Online
Article
Text
id pubmed-5833079
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-5833079 2018-03-05 Control procedures and estimators of the false discovery rate and their application in low-dimensional settings: an empirical investigation Brinster, Regina Köttgen, Anna Tayo, Bamidele O. Schumacher, Martin Sekula, Peggy BMC Bioinformatics Research Article BioMed Central 2018-03-02 /pmc/articles/PMC5833079/ /pubmed/29499647 http://dx.doi.org/10.1186/s12859-018-2081-x Text en © The Author(s). 2018 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Control procedures and estimators of the false discovery rate and their application in low-dimensional settings: an empirical investigation
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5833079/
https://www.ncbi.nlm.nih.gov/pubmed/29499647
http://dx.doi.org/10.1186/s12859-018-2081-x