Replicability analysis in genome-wide association studies via Cartesian hidden Markov models
BACKGROUND: Replicability analysis, which aims to detect replicated signals, is attracting increasing attention in modern scientific applications. For example, in genome-wide association studies (GWAS), it is more convincing to detect an association that can be replicated in more than one study. S...
Main authors: | Wang, Pengfei; Zhu, Wensheng |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2019 |
Subjects: | Methodology Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6423849/ https://www.ncbi.nlm.nih.gov/pubmed/30885122 http://dx.doi.org/10.1186/s12859-019-2707-7 |
_version_ | 1783404600245616640 |
---|---|
author | Wang, Pengfei Zhu, Wensheng |
collection | PubMed |
description | BACKGROUND: Replicability analysis, which aims to detect replicated signals, is attracting increasing attention in modern scientific applications. For example, in genome-wide association studies (GWAS), it is more convincing to detect an association that can be replicated in more than one study. Since neighboring single nucleotide polymorphisms (SNPs) often exhibit high correlation, it is desirable to properly exploit the dependency information among adjacent SNPs in replicability analysis. In this paper, we propose a novel multiple testing procedure based on the Cartesian hidden Markov model (CHMM), called the repLIS procedure, for replicability analysis across two studies; it characterizes the local dependence structure among adjacent SNPs via a four-state Markov chain. RESULTS: Theoretical results show that the repLIS procedure controls the false discovery rate (FDR) at the nominal level α and is optimal in the sense that it has the smallest false non-discovery rate (FNR) among all α-level multiple testing procedures. We carry out simulation studies to compare the repLIS procedure with existing methods, including the Benjamini-Hochberg (BH) procedure and the empirical Bayes approach called repfdr. Finally, we apply the repLIS and repfdr procedures to replicability analyses of psychiatric disorder data sets collected by the Psychiatric Genomics Consortium (PGC) and the Wellcome Trust Case Control Consortium (WTCCC). Both the simulation studies and the real data analysis show that the repLIS procedure is valid and more efficient than its competitors. CONCLUSIONS: In replicability analysis, the repLIS procedure controls the FDR at the pre-specified level α and gains efficiency by exploiting the dependency information among adjacent SNPs. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12859-019-2707-7) contains supplementary material, which is available to authorized users. |
format | Online Article Text |
id | pubmed-6423849 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2019 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-6423849 2019-03-28 BMC Bioinformatics Methodology Article BioMed Central 2019-03-18 /pmc/articles/PMC6423849/ /pubmed/30885122 http://dx.doi.org/10.1186/s12859-019-2707-7 Text en © The Author(s) 2019 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
title | Replicability analysis in genome-wide association studies via Cartesian hidden Markov models |
topic | Methodology Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6423849/ https://www.ncbi.nlm.nih.gov/pubmed/30885122 http://dx.doi.org/10.1186/s12859-019-2707-7 |
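
The abstract in this record names the Benjamini-Hochberg (BH) procedure as one of the baselines that repLIS is compared against. As a rough illustration only, the sketch below shows the standard BH step-up rule in plain Python. It is not the authors' repLIS procedure, and it does not model dependence among adjacent SNPs. The helper `max_p_values`, which combines two studies by taking the larger p-value per SNP, is a hypothetical convention assumed here for the example; the record does not say how the BH baseline was applied across the two studies.

```python
from typing import List, Sequence


def benjamini_hochberg(p_values: Sequence[float], alpha: float = 0.05) -> List[bool]:
    """Return one rejection flag per hypothesis using the BH step-up rule."""
    m = len(p_values)
    if m == 0:
        return []
    # Indices of the p-values, sorted from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Step-up: find the largest 1-based rank k with p_(k) <= k * alpha / m.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * alpha / m:
            cutoff_rank = rank
    # Reject the hypotheses holding the cutoff_rank smallest p-values.
    rejected = [False] * m
    for idx in order[:cutoff_rank]:
        rejected[idx] = True
    return rejected


def max_p_values(study1: Sequence[float], study2: Sequence[float]) -> List[float]:
    """Hypothetical helper (an assumption, not from the record): per-SNP
    maximum of the two studies' p-values, so a SNP is flagged only when
    both studies show evidence."""
    return [max(a, b) for a, b in zip(study1, study2)]


if __name__ == "__main__":
    # Made-up p-values for four SNPs in two studies, for illustration only.
    study1 = [0.001, 0.2, 0.004, 0.5]
    study2 = [0.002, 0.01, 0.03, 0.6]
    combined = max_p_values(study1, study2)
    print(benjamini_hochberg(combined, alpha=0.1))
```

The BH rule treats each SNP's p-value on its own, which is the limitation the abstract points to: it ignores the local dependence among adjacent SNPs that the CHMM-based procedure is designed to exploit.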