Extracting replicable associations across multiple studies: Empirical Bayes algorithms for controlling the false discovery rate

In almost every field in genomics, large-scale biomedical datasets are used to report associations. Extracting associations that recur across multiple studies while controlling the false discovery rate is a fundamental challenge. Here, we propose a new method to allow joint analysis of multiple stud...

Bibliographic Details
Main authors: Amar, David; Shamir, Ron; Yekutieli, Daniel
Format: Online Article Text
Language: English
Published: Public Library of Science 2017
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5576761/
https://www.ncbi.nlm.nih.gov/pubmed/28821015
http://dx.doi.org/10.1371/journal.pcbi.1005700
author Amar, David
Shamir, Ron
Yekutieli, Daniel
collection PubMed
description In almost every field in genomics, large-scale biomedical datasets are used to report associations. Extracting associations that recur across multiple studies while controlling the false discovery rate is a fundamental challenge. Here, we propose a new method to allow joint analysis of multiple studies. Given a set of p-values obtained from each study, the goal is to identify associations that recur in at least k > 1 studies while controlling the false discovery rate. We propose several new algorithms that differ in how the study dependencies are modeled, and compare them and extant methods under various simulated scenarios. The top algorithm, SCREEN (Scalable Cluster-based REplicability ENhancement), is our new algorithm that works in three stages: (1) clustering an estimated correlation network of the studies, (2) learning replicability (e.g., of genes) within clusters, and (3) merging the results across the clusters. When we applied SCREEN to two real datasets it greatly outperformed the results obtained via standard meta-analysis. First, on a collection of 29 case-control gene expression cancer studies, we detected a large set of consistently up-regulated genes related to proliferation and cell cycle regulation. These genes are both consistently up-regulated across many cancer studies, and are well connected in known gene networks. Second, on a recent pan-cancer study that examined the expression profiles of patients with and without mutations in the HLA complex, we detected a large active module of up-regulated genes that are both related to immune responses and are well connected in known gene networks. This module covers thrice more genes as compared to the original study at a similar false discovery rate, demonstrating the high power of SCREEN. An implementation of SCREEN is available in the supplement.
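The goal stated in the description (flagging associations that recur in at least k studies at a controlled false discovery rate) can be illustrated with a minimal sketch. This is not the SCREEN implementation from the article's supplement. It assumes that the studies are independent and that a per-study posterior probability of being null (a local fdr) has already been estimated for each gene; the function names below are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence


def prob_fewer_than_k_nonnull(null_probs: Sequence[float], k: int) -> float:
    """P(fewer than k studies are non-null) for one gene.

    Assumption (not from the article): studies are independent, and
    null_probs[i] is the posterior probability that the gene is null
    in study i.
    """
    # dist[j] = probability that exactly j of the studies seen so far are non-null
    dist = [1.0]
    for p0 in null_probs:
        nxt = [0.0] * (len(dist) + 1)
        for j, mass in enumerate(dist):
            nxt[j] += mass * p0              # this study is null
            nxt[j + 1] += mass * (1.0 - p0)  # this study is non-null
        dist = nxt
    return sum(dist[:k])


def select_replicable(fdr_k: Sequence[float], q: float) -> List[int]:
    """Indices of the largest set of genes whose mean fdr_k is at most q.

    fdr_k[g] is the probability that gene g is non-null in fewer than k
    studies; the mean of these values over the selected set estimates the
    false discovery rate of the selection.
    """
    order = sorted(range(len(fdr_k)), key=lambda i: fdr_k[i])
    selected: List[int] = []
    running = 0.0
    for rank, i in enumerate(order, start=1):
        running += fdr_k[i]
        # Values are added in ascending order, so the running mean never
        # decreases and the first violation ends the search.
        if running / rank > q:
            break
        selected.append(i)
    return selected
```

For example, a gene with per-study null probabilities 0.1, 0.2 and 0.9 and k = 2 would be scored with `prob_fewer_than_k_nonnull([0.1, 0.2, 0.9], 2)`, and the scores for all genes would then be passed to `select_replicable` together with the target rate q. Modeling dependence between studies, which the description names as the point on which the proposed algorithms differ, is deliberately left out of this sketch.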
format Online
Article
Text
id pubmed-5576761
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-5576761 2017-09-15 Extracting replicable associations across multiple studies: Empirical Bayes algorithms for controlling the false discovery rate. Amar, David; Shamir, Ron; Yekutieli, Daniel. PLoS Comput Biol. Research Article. Public Library of Science 2017-08-18 /pmc/articles/PMC5576761/ /pubmed/28821015 http://dx.doi.org/10.1371/journal.pcbi.1005700 Text en © 2017 Amar et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title Extracting replicable associations across multiple studies: Empirical Bayes algorithms for controlling the false discovery rate
topic Research Article