Association Studies with Imputed Variants Using Expectation-Maximization Likelihood-Ratio Tests

Genotype imputation has become standard practice in modern genetic studies. As sequencing-based reference panels continue to grow, more markers are being imputed well, but at the same time even more markers with relatively low minor allele frequency are being imputed with low...

Bibliographic Details
Main authors: Huang, Kuan-Chieh; Sun, Wei; Wu, Ying; Chen, Mengjie; Mohlke, Karen L.; Lange, Leslie A.; Li, Yun
Format: Online Article Text
Language: English
Published: Public Library of Science 2014
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4226494/
https://www.ncbi.nlm.nih.gov/pubmed/25383782
http://dx.doi.org/10.1371/journal.pone.0110679
_version_ 1782343630058422272
author Huang, Kuan-Chieh
Sun, Wei
Wu, Ying
Chen, Mengjie
Mohlke, Karen L.
Lange, Leslie A.
Li, Yun
author_facet Huang, Kuan-Chieh
Sun, Wei
Wu, Ying
Chen, Mengjie
Mohlke, Karen L.
Lange, Leslie A.
Li, Yun
author_sort Huang, Kuan-Chieh
collection PubMed
description Genotype imputation has become standard practice in modern genetic studies. As sequencing-based reference panels continue to grow, more markers are being imputed well, but at the same time even more markers with relatively low minor allele frequency (MAF) are being imputed with low imputation quality. Here, we propose new methods that incorporate imputation uncertainty for downstream association analysis, with improved power and/or computational efficiency. We consider two scenarios: I) when posterior probabilities of all potential genotypes are estimated; and II) when only the one-dimensional summary statistic, imputed dosage, is available. For scenario I, we have developed an expectation-maximization likelihood-ratio test (EM-LRT) for association based on posterior probabilities. When only imputed dosages are available (scenario II), we first sample the genotype probabilities from their posterior distribution given the dosages, and then apply the EM-LRT to the sampled probabilities. Our simulations show that the type I error of the proposed EM-LRT methods is protected under both scenarios. Compared with existing methods, EM-LRT-Prob (for scenario I) offers optimal statistical power across a wide spectrum of MAF and imputation quality. EM-LRT-Dose (for scenario II) achieves a level of statistical power similar to that of EM-LRT-Prob and outperforms the standard Dosage method, especially for markers with relatively low MAF or imputation quality. Applications to two real data sets, the Cebu Longitudinal Health and Nutrition Survey study and the Women’s Health Initiative Study, provide further support for the validity and efficiency of our proposed methods.
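Illustrative note (not part of the catalog record): the description above outlines a likelihood-ratio test fitted by EM on genotype posterior probabilities (scenario I). The record does not give the authors' model, so the following is only a minimal sketch under stated assumptions: a quantitative trait, an additive genotype effect, and normally distributed residuals. The names `em_lrt`, `_e_step`, and `GENOTYPES` are hypothetical and do not come from the paper or any published software.

```python
import math
import numpy as np

GENOTYPES = np.array([0.0, 1.0, 2.0])  # assumed additive coding: allele copies

def _e_step(y, probs, mu, beta, sigma2):
    """Return per-sample genotype weights and the observed-data log-likelihood."""
    resid = y[:, None] - mu - beta * GENOTYPES[None, :]
    log_dens = -0.5 * resid**2 / sigma2 - 0.5 * math.log(2 * math.pi * sigma2)
    shift = log_dens.max(axis=1, keepdims=True)  # subtract row max for stability
    mix = probs * np.exp(log_dens - shift)
    row = mix.sum(axis=1, keepdims=True)
    return mix / row, float(np.sum(np.log(row) + shift))

def em_lrt(y, probs, max_iter=500, tol=1e-10):
    """Sketch of an EM likelihood-ratio test of beta = 0.

    y     : (n,) quantitative trait
    probs : (n, 3) posterior probabilities of genotypes 0/1/2 (rows sum to 1)
    Returns (LRT statistic, p-value from chi-square with 1 df, beta estimate).
    """
    y = np.asarray(y, dtype=float)
    probs = np.asarray(probs, dtype=float)
    n = y.size

    # Null model (beta = 0): the mixture collapses to one normal, so the
    # maximum-likelihood estimates are the sample mean and variance.
    mu, beta, sigma2 = y.mean(), 0.0, y.var()
    w, ll0 = _e_step(y, probs, mu, beta, sigma2)
    ll = ll0

    for _ in range(max_iter):
        # M-step: weighted least squares over the three genotype classes.
        eg, eg2 = w @ GENOTYPES, w @ GENOTYPES**2
        denom = eg2.sum() - eg.sum() ** 2 / n
        beta = (np.sum(y * eg) - y.mean() * eg.sum()) / denom if denom > 1e-12 else 0.0
        mu = y.mean() - beta * eg.mean()
        resid = y[:, None] - mu - beta * GENOTYPES[None, :]
        sigma2 = np.sum(w * resid**2) / n
        # E-step: update genotype weights under the new parameters.
        w, ll_new = _e_step(y, probs, mu, beta, sigma2)
        converged = ll_new - ll < tol
        ll = max(ll, ll_new)
        if converged:
            break

    stat = max(0.0, 2.0 * (ll - ll0))
    p_value = math.erfc(math.sqrt(stat / 2.0))  # chi-square(1) upper tail
    return stat, p_value, float(beta)

if __name__ == "__main__":
    # Simulated data for illustration only; not taken from the study.
    rng = np.random.default_rng(0)
    g = rng.binomial(2, 0.2, size=300)
    y = 0.4 * g + rng.normal(size=300)
    probs = 0.7 * np.eye(3)[g] + 0.3 / 3  # blurred genotype calls
    print(em_lrt(y, probs))
```

When every row of `probs` puts all its mass on one genotype, the weights never change and the sketch reduces to ordinary least squares on the hard calls, which is a convenient sanity check. Scenario II (sampling probabilities given only the dosage) is not sketched here because the record does not specify the sampling distribution.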
format Online
Article
Text
id pubmed-4226494
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-4226494 2014-11-13 Association Studies with Imputed Variants Using Expectation-Maximization Likelihood-Ratio Tests Huang, Kuan-Chieh Sun, Wei Wu, Ying Chen, Mengjie Mohlke, Karen L. Lange, Leslie A. Li, Yun PLoS One Research Article
Public Library of Science 2014-11-10 /pmc/articles/PMC4226494/ /pubmed/25383782 http://dx.doi.org/10.1371/journal.pone.0110679 Text en © 2014 Huang et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
spellingShingle Research Article
Huang, Kuan-Chieh
Sun, Wei
Wu, Ying
Chen, Mengjie
Mohlke, Karen L.
Lange, Leslie A.
Li, Yun
Association Studies with Imputed Variants Using Expectation-Maximization Likelihood-Ratio Tests
title Association Studies with Imputed Variants Using Expectation-Maximization Likelihood-Ratio Tests
title_full Association Studies with Imputed Variants Using Expectation-Maximization Likelihood-Ratio Tests
title_fullStr Association Studies with Imputed Variants Using Expectation-Maximization Likelihood-Ratio Tests
title_full_unstemmed Association Studies with Imputed Variants Using Expectation-Maximization Likelihood-Ratio Tests
title_short Association Studies with Imputed Variants Using Expectation-Maximization Likelihood-Ratio Tests
title_sort association studies with imputed variants using expectation-maximization likelihood-ratio tests
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4226494/
https://www.ncbi.nlm.nih.gov/pubmed/25383782
http://dx.doi.org/10.1371/journal.pone.0110679
work_keys_str_mv AT huangkuanchieh associationstudieswithimputedvariantsusingexpectationmaximizationlikelihoodratiotests
AT sunwei associationstudieswithimputedvariantsusingexpectationmaximizationlikelihoodratiotests
AT wuying associationstudieswithimputedvariantsusingexpectationmaximizationlikelihoodratiotests
AT chenmengjie associationstudieswithimputedvariantsusingexpectationmaximizationlikelihoodratiotests
AT mohlkekarenl associationstudieswithimputedvariantsusingexpectationmaximizationlikelihoodratiotests
AT langelesliea associationstudieswithimputedvariantsusingexpectationmaximizationlikelihoodratiotests
AT liyun associationstudieswithimputedvariantsusingexpectationmaximizationlikelihoodratiotests