Large-scale multiple testing in genome-wide association studies via region-specific hidden Markov models

BACKGROUND: Identifying genetic variants associated with complex human diseases is a great challenge in genome-wide association studies (GWAS). Single nucleotide polymorphisms (SNPs) arising from genetic background are often dependent. The existing methods, i.e., local index of significance (LIS) an...

Bibliographic Details
Main authors: Xiao, Jian; Zhu, Wensheng; Guo, Jianhua
Format: Online Article Text
Language: English
Published: BioMed Central 2013
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3850654/
https://www.ncbi.nlm.nih.gov/pubmed/24067069
http://dx.doi.org/10.1186/1471-2105-14-282
_version_ 1782294137131761664
author Xiao, Jian
Zhu, Wensheng
Guo, Jianhua
collection PubMed
description BACKGROUND: Identifying genetic variants associated with complex human diseases is a great challenge in genome-wide association studies (GWAS). Single nucleotide polymorphisms (SNPs) arising from genetic background are often dependent. The existing methods, i.e., local index of significance (LIS) and pooled local index of significance (PLIS), were both proposed for modeling SNP dependence and assumed that the whole chromosome follows a hidden Markov model (HMM). However, the fact that SNP data are often collected from separate heterogeneous regions of a single chromosome encourages different chromosomal regions to follow different HMMs. In this research, we developed a data-driven penalized criterion combined with a dynamic programming algorithm to find change points that divide the whole chromosome into more homogeneous regions. Furthermore, we extended PLIS to analyze the dependent tests obtained from multiple chromosomes with different regions for GWAS. RESULTS: The simulation results show that our new criterion can improve the performance of the model selection procedure and that our region-specific PLIS (RSPLIS) method is better than PLIS at detecting disease-associated SNPs when there are multiple change points along a chromosome. Our method has been used to analyze the Daly study, and compared with PLIS, RSPLIS yielded results that more accurately detected disease-associated SNPs. CONCLUSIONS: The genomic rankings based on our method differ from the rankings based on PLIS. Specifically, for the detection of genetic variants with weak effect sizes, the RSPLIS method was able to rank them more efficiently and with greater power.
format Online
Article
Text
id pubmed-3850654
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3850654 2013-12-16 Large-scale multiple testing in genome-wide association studies via region-specific hidden Markov models Xiao, Jian; Zhu, Wensheng; Guo, Jianhua BMC Bioinformatics Methodology Article
BioMed Central 2013-09-25 /pmc/articles/PMC3850654/ /pubmed/24067069 http://dx.doi.org/10.1186/1471-2105-14-282 Text en Copyright © 2013 Xiao et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Large-scale multiple testing in genome-wide association studies via region-specific hidden Markov models
topic Methodology Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3850654/
https://www.ncbi.nlm.nih.gov/pubmed/24067069
http://dx.doi.org/10.1186/1471-2105-14-282