Association Analysis and Meta-Analysis of Multi-Allelic Variants for Large-Scale Sequence Data
There is great interest in understanding the impact of rare variants in human diseases using large sequence datasets. In deep sequence datasets of >10,000 samples, ~10% of the variant sites are observed to be multi-allelic. Many of the multi-allelic variants have been shown to be functional and d...
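Main Authors: | Jiang, Yu; Chen, Sai; Wang, Xingyan; Liu, Mengzhen; Iacono, William G.; Hewitt, John K.; Hokanson, John E.; Krauter, Kenneth; Laakso, Markku; Li, Kevin W.; Lutz, Sharon M.; McGue, Matthew; Pandit, Anita; Zajac, Gregory J.M.; Boehnke, Michael; Abecasis, Goncalo R.; Vrieze, Scott I.; Jiang, Bibo; Zhan, Xiaowei; Liu, Dajiang J. |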
---|---|
Format: | Online Article Text |
Language: | English |
Published: | MDPI 2020 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7288273/ https://www.ncbi.nlm.nih.gov/pubmed/32466134 http://dx.doi.org/10.3390/genes11050586 |
_version_ | 1783545240823529472 |
---|---|
author | Jiang, Yu; Chen, Sai; Wang, Xingyan; Liu, Mengzhen; Iacono, William G.; Hewitt, John K.; Hokanson, John E.; Krauter, Kenneth; Laakso, Markku; Li, Kevin W.; Lutz, Sharon M.; McGue, Matthew; Pandit, Anita; Zajac, Gregory J.M.; Boehnke, Michael; Abecasis, Goncalo R.; Vrieze, Scott I.; Jiang, Bibo; Zhan, Xiaowei; Liu, Dajiang J. |
author_sort | Jiang, Yu |
collection | PubMed |
description | There is great interest in understanding the impact of rare variants in human diseases using large sequence datasets. In deep sequence datasets of >10,000 samples, ~10% of the variant sites are observed to be multi-allelic. Many of the multi-allelic variants have been shown to be functional and disease-relevant. Proper analysis of multi-allelic variants is critical to the success of a sequencing study, but existing methods do not properly handle multi-allelic variants and can produce highly misleading association results. We discuss practical issues and methods to encode multi-allelic sites, conduct single-variant and gene-level association analyses, and perform meta-analysis for multi-allelic variants. We evaluated these methods through extensive simulations and the study of a large meta-analysis of ~18,000 samples on the cigarettes-per-day phenotype. We showed that our joint modeling approach provided an unbiased estimate of genetic effects, greatly improved the power of single-variant association tests among methods that can properly estimate allele effects, and enhanced gene-level tests over existing approaches. Software packages implementing these methods are available online. |
format | Online Article Text |
id | pubmed-7288273 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-72882732020-06-17 Association Analysis and Meta-Analysis of Multi-Allelic Variants for Large-Scale Sequence Data Jiang, Yu Chen, Sai Wang, Xingyan Liu, Mengzhen Iacono, William G. Hewitt, John K. Hokanson, John E. Krauter, Kenneth Laakso, Markku Li, Kevin W. Lutz, Sharon M. McGue, Matthew Pandit, Anita Zajac, Gregory J.M. Boehnke, Michael Abecasis, Goncalo R. Vrieze, Scott I. Jiang, Bibo Zhan, Xiaowei Liu, Dajiang J. Genes (Basel) Article There is great interest in understanding the impact of rare variants in human diseases using large sequence datasets. In deep sequence datasets of >10,000 samples, ~10% of the variant sites are observed to be multi-allelic. Many of the multi-allelic variants have been shown to be functional and disease-relevant. Proper analysis of multi-allelic variants is critical to the success of a sequencing study, but existing methods do not properly handle multi-allelic variants and can produce highly misleading association results. We discuss practical issues and methods to encode multi-allelic sites, conduct single-variant and gene-level association analyses, and perform meta-analysis for multi-allelic variants. We evaluated these methods through extensive simulations and the study of a large meta-analysis of ~18,000 samples on the cigarettes-per-day phenotype. We showed that our joint modeling approach provided an unbiased estimate of genetic effects, greatly improved the power of single-variant association tests among methods that can properly estimate allele effects, and enhanced gene-level tests over existing approaches. Software packages implementing these methods are available online. MDPI 2020-05-25 /pmc/articles/PMC7288273/ /pubmed/32466134 http://dx.doi.org/10.3390/genes11050586 Text en © 2020 by the authors. Licensee MDPI, Basel, Switzerland. 
This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/). |
title | Association Analysis and Meta-Analysis of Multi-Allelic Variants for Large-Scale Sequence Data |
topic | Article |