
A novel approach to minimize false discovery rate in genome-wide data analysis

BACKGROUND: High-throughput technologies, such as DNA microarray, have significantly advanced biological and biomedical research by enabling researchers to carry out genome-wide screens. One critical task in analyzing genome-wide datasets is to control the false discovery rate (FDR) so that the prop...


Bibliographic Details
Main authors: Bei, Yuanzhe; Hong, Pengyu
Format: Online Article Text
Language: English
Published: BioMed Central 2013
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3856609/
https://www.ncbi.nlm.nih.gov/pubmed/24564975
http://dx.doi.org/10.1186/1752-0509-7-S4-S1
author Bei, Yuanzhe
Hong, Pengyu
collection PubMed
description BACKGROUND: High-throughput technologies, such as DNA microarrays, have significantly advanced biological and biomedical research by enabling researchers to carry out genome-wide screens. One critical task in analyzing genome-wide datasets is to control the false discovery rate (FDR) so that the proportion of false positive features among those called significant is kept low. A number of FDR control methods have recently been proposed and are widely used, such as the Benjamini-Hochberg (BH) approach, the Storey approach and Significance Analysis of Microarrays (SAM). METHODS: This paper presents a straightforward yet powerful FDR control method termed miFDR, which aims to minimize the FDR when calling a fixed number of significant features. We proved theoretically that the strategy used by miFDR finds the optimal number of significant features when the desired FDR is fixed. RESULTS: We compared miFDR with the BH approach, the Storey approach and SAM on both simulated datasets and public DNA microarray datasets. The results demonstrated that miFDR outperforms the other methods by identifying more significant features under the same FDR cut-offs. A literature search showed that many genes called only by miFDR are indeed relevant to the underlying biology of interest. CONCLUSIONS: FDR control has been widely applied in the analysis of high-throughput datasets, which allow for rapid discoveries. Under the same FDR threshold, miFDR is capable of identifying more significant features than its competitors at a comparable level of complexity. It can therefore potentially have a substantial impact on biological and biomedical research. AVAILABILITY: miFDR is available from the authors on request.
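As an illustration of what FDR control means in practice, the following is a minimal Python sketch of the standard Benjamini-Hochberg step-up procedure, one of the baseline methods named in the abstract. It is not the miFDR method, whose algorithm is not described in this record. The function name `benjamini_hochberg`, the parameter `alpha`, and the example p-values are assumptions made for illustration only.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return sorted indices of features called significant by the
    Benjamini-Hochberg step-up procedure at FDR level alpha.

    Illustrative sketch only; names and defaults are hypothetical.
    """
    m = len(p_values)
    if m == 0:
        return []
    # Order feature indices by ascending p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest 1-based rank k with p_(k) <= k * alpha / m.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * alpha / m:
            k_max = rank
    # Call the k_max features with the smallest p-values significant.
    return sorted(order[:k_max])


# Hypothetical p-values for ten features (not taken from the paper).
example_p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
significant = benjamini_hochberg(example_p, alpha=0.05)
print(significant)
```

The procedure is "step-up" because it keeps every feature up to the largest rank that passes its threshold, even if some smaller ranks failed theirs.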
format Online
Article
Text
id pubmed-3856609
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3856609 2013-12-16 A novel approach to minimize false discovery rate in genome-wide data analysis. Bei, Yuanzhe; Hong, Pengyu. BMC Syst Biol. Research.
BioMed Central 2013-10-23 /pmc/articles/PMC3856609/ /pubmed/24564975 http://dx.doi.org/10.1186/1752-0509-7-S4-S1 Text en Copyright © 2013 Bei and Hong; licensee BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title A novel approach to minimize false discovery rate in genome-wide data analysis
topic Research