
Large-scale multiple testing in genome-wide association studies via region-specific hidden Markov models

BACKGROUND: Identifying genetic variants associated with complex human diseases is a great challenge in genome-wide association studies (GWAS). Single nucleotide polymorphisms (SNPs) arising from genetic background are often dependent. The existing methods, i.e., local index of significance (LIS) an...


Bibliographic Details
Main authors:	Xiao, Jian; Zhu, Wensheng; Guo, Jianhua
Format:	Online Article Text
Language:	English
Published:	BioMed Central 2013
Subjects:	Methodology Article
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3850654/ https://www.ncbi.nlm.nih.gov/pubmed/24067069 http://dx.doi.org/10.1186/1471-2105-14-282

author	Xiao, Jian; Zhu, Wensheng; Guo, Jianhua
collection	PubMed
description	BACKGROUND: Identifying genetic variants associated with complex human diseases is a great challenge in genome-wide association studies (GWAS). Single nucleotide polymorphisms (SNPs) arising from genetic background are often dependent. The existing methods, i.e., local index of significance (LIS) and pooled local index of significance (PLIS), were both proposed for modeling SNP dependence and assumed that the whole chromosome follows a hidden Markov model (HMM). However, the fact that SNP data are often collected from separate heterogeneous regions of a single chromosome encourages different chromosomal regions to follow different HMMs. In this research, we developed a data-driven penalized criterion combined with a dynamic programming algorithm to find change points that divide the whole chromosome into more homogeneous regions. Furthermore, we extended PLIS to analyze the dependent tests obtained from multiple chromosomes with different regions for GWAS. RESULTS: The simulation results show that our new criterion can improve the performance of the model selection procedure and that our region-specific PLIS (RSPLIS) method is better than PLIS at detecting disease-associated SNPs when there are multiple change points along a chromosome. Our method has been used to analyze the Daly study, and compared with PLIS, RSPLIS yielded results that more accurately detected disease-associated SNPs. CONCLUSIONS: The genomic rankings based on our method differ from the rankings based on PLIS. Specifically, for the detection of genetic variants with weak effect sizes, the RSPLIS method was able to rank them more efficiently and with greater power.
format	Online Article Text
id	pubmed-3850654
institution	National Center for Biotechnology Information
language	English
publishDate	2013
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-3850654 2013-12-16 Large-scale multiple testing in genome-wide association studies via region-specific hidden Markov models Xiao, Jian; Zhu, Wensheng; Guo, Jianhua BMC Bioinformatics Methodology Article BioMed Central 2013-09-25 /pmc/articles/PMC3850654/ /pubmed/24067069 http://dx.doi.org/10.1186/1471-2105-14-282 Text en Copyright © 2013 Xiao et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title	Large-scale multiple testing in genome-wide association studies via region-specific hidden Markov models
topic	Methodology Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3850654/ https://www.ncbi.nlm.nih.gov/pubmed/24067069 http://dx.doi.org/10.1186/1471-2105-14-282


Similar Items