Replicability analysis in genome-wide association studies via Cartesian hidden Markov models

BACKGROUND: Replicability analysis, which aims to detect replicated signals, is attracting increasing attention in modern scientific applications. For example, in genome-wide association studies (GWAS), an association is more convincing when it can be replicated in more than one study. S...

Bibliographic Details
Main authors: Wang, Pengfei, Zhu, Wensheng
Format: Online Article Text
Language: English
Published: BioMed Central 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6423849/
https://www.ncbi.nlm.nih.gov/pubmed/30885122
http://dx.doi.org/10.1186/s12859-019-2707-7
_version_ 1783404600245616640
author Wang, Pengfei
Zhu, Wensheng
collection PubMed
description BACKGROUND: Replicability analysis, which aims to detect replicated signals, is attracting increasing attention in modern scientific applications. For example, in genome-wide association studies (GWAS), an association is more convincing when it can be replicated in more than one study. Because neighboring single nucleotide polymorphisms (SNPs) often exhibit high correlation, it is desirable to properly exploit the dependency information among adjacent SNPs in replicability analysis. In this paper, we propose a novel multiple testing procedure for replicability analysis across two studies, called the repLIS procedure, which is based on the Cartesian hidden Markov model (CHMM) and characterizes the local dependence structure among adjacent SNPs via a four-state Markov chain. RESULTS: Theoretical results show that the repLIS procedure can control the false discovery rate (FDR) at the nominal level α and is optimal in the sense that it has the smallest false non-discovery rate (FNR) among all α-level multiple testing procedures. We carry out simulation studies to compare the repLIS procedure with existing methods, including the Benjamini-Hochberg (BH) procedure and the empirical Bayes approach repfdr. Finally, we apply the repLIS and repfdr procedures to replicability analyses of psychiatric disorder data sets collected by the Psychiatric Genomics Consortium (PGC) and the Wellcome Trust Case Control Consortium (WTCCC). Both the simulation studies and the real data analysis show that the repLIS procedure is valid and achieves higher efficiency than its competitors. CONCLUSIONS: In replicability analysis, the repLIS procedure controls the FDR at the pre-specified level α and can achieve higher efficiency by exploiting the dependency information among adjacent SNPs.
ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12859-019-2707-7) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-6423849
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-6423849 2019-03-28 Replicability analysis in genome-wide association studies via Cartesian hidden Markov models Wang, Pengfei Zhu, Wensheng BMC Bioinformatics Methodology Article BioMed Central 2019-03-18 /pmc/articles/PMC6423849/ /pubmed/30885122 http://dx.doi.org/10.1186/s12859-019-2707-7 Text en © The Author(s) 2019 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Replicability analysis in genome-wide association studies via Cartesian hidden Markov models
topic Methodology Article