Large-scale multiple testing in genome-wide association studies via region-specific hidden Markov models
BACKGROUND: Identifying genetic variants associated with complex human diseases is a great challenge in genome-wide association studies (GWAS). Single nucleotide polymorphisms (SNPs) arising from genetic background are often dependent. The existing methods, i.e., local index of significance (LIS) an...
Main authors: | Xiao, Jian; Zhu, Wensheng; Guo, Jianhua |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2013 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3850654/ https://www.ncbi.nlm.nih.gov/pubmed/24067069 http://dx.doi.org/10.1186/1471-2105-14-282 |
_version_ | 1782294137131761664 |
---|---|
author | Xiao, Jian Zhu, Wensheng Guo, Jianhua |
author_facet | Xiao, Jian Zhu, Wensheng Guo, Jianhua |
author_sort | Xiao, Jian |
collection | PubMed |
description | BACKGROUND: Identifying genetic variants associated with complex human diseases is a great challenge in genome-wide association studies (GWAS). Single nucleotide polymorphisms (SNPs) arising from genetic background are often dependent. The existing methods, i.e., local index of significance (LIS) and pooled local index of significance (PLIS), were both proposed for modeling SNP dependence and assumed that the whole chromosome follows a hidden Markov model (HMM). However, the fact that SNP data are often collected from separate heterogeneous regions of a single chromosome encourages different chromosomal regions to follow different HMMs. In this research, we developed a data-driven penalized criterion combined with a dynamic programming algorithm to find change points that divide the whole chromosome into more homogeneous regions. Furthermore, we extended PLIS to analyze the dependent tests obtained from multiple chromosomes with different regions for GWAS. RESULTS: The simulation results show that our new criterion can improve the performance of the model selection procedure and that our region-specific PLIS (RSPLIS) method is better than PLIS at detecting disease-associated SNPs when there are multiple change points along a chromosome. Our method has been used to analyze the Daly study, and compared with PLIS, RSPLIS yielded results that more accurately detected disease-associated SNPs. CONCLUSIONS: The genomic rankings based on our method differ from the rankings based on PLIS. Specifically, for the detection of genetic variants with weak effect sizes, the RSPLIS method was able to rank them more efficiently and with greater power. |
format | Online Article Text |
id | pubmed-3850654 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2013 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-3850654 2013-12-16 Large-scale multiple testing in genome-wide association studies via region-specific hidden Markov models Xiao, Jian Zhu, Wensheng Guo, Jianhua BMC Bioinformatics Methodology Article
BioMed Central 2013-09-25 /pmc/articles/PMC3850654/ /pubmed/24067069 http://dx.doi.org/10.1186/1471-2105-14-282 Text en Copyright © 2013 Xiao et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Methodology Article Xiao, Jian Zhu, Wensheng Guo, Jianhua Large-scale multiple testing in genome-wide association studies via region-specific hidden Markov models |
title | Large-scale multiple testing in genome-wide association studies via region-specific hidden Markov models |
topic | Methodology Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3850654/ https://www.ncbi.nlm.nih.gov/pubmed/24067069 http://dx.doi.org/10.1186/1471-2105-14-282 |