
A new strategy for enhancing imputation quality of rare variants from next-generation sequencing data via combining SNP and exome chip data

BACKGROUND: Rare variants have gathered increasing attention as a possible alternative source of missing heritability. Since next generation sequencing technology is not yet cost-effective for large-scale genomic studies, a widely used alternative approach is imputation. However, the imputation appr...


Bibliographic Details
Main authors: Kim, Young Jin; Lee, Juyoung; Kim, Bong-Jo; Park, Taesung
Format: Online Article Text
Language: English
Published: BioMed Central 2015
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4696174/
https://www.ncbi.nlm.nih.gov/pubmed/26715385
http://dx.doi.org/10.1186/s12864-015-2192-y
author Kim, Young Jin
Lee, Juyoung
Kim, Bong-Jo
Park, Taesung
collection PubMed
description BACKGROUND: Rare variants have gathered increasing attention as a possible alternative source of missing heritability. Since next-generation sequencing technology is not yet cost-effective for large-scale genomic studies, a widely used alternative approach is imputation. However, the imputation approach may be limited by the low accuracy of the imputed rare variants. To improve imputation accuracy of rare variants, various approaches have been suggested, including increasing the sample size of the reference panel, using sequencing data from study-specific samples (i.e., specific populations), and using local reference panels by genotyping or sequencing a subset of study samples. While these approaches mainly utilize reference panels, imputation accuracy of rare variants can also be increased by using exome chips containing rare variants. The exome chip contains 250 K rare variants selected from the discovered variants of about 12,000 sequenced samples. If exome chip data are available for previously genotyped samples, the combined approach using a genotype panel of merged data, including exome chips and SNP chips, should increase the imputation accuracy of rare variants. RESULTS: In this study, we describe a combined imputation approach that uses both exome chip and SNP chip data simultaneously as a genotype panel. The effectiveness and performance of the combined approach were demonstrated using a reference panel of 848 samples constructed from exome sequencing data of the T2D-GENES consortium and genotype panels of 5,349 samples consisting of exome chip and SNP chip data. As a result, the combined approach increased imputation quality by up to 11 %, and genomic coverage for rare variants by up to 117.7 % (MAF < 1 %), compared to imputation using the SNP chip alone. We also investigated the systematic effect of reference panels on imputation quality using five reference panels and three genotype panels. The best-performing approach was the combination of the study-specific reference panel and the genotype panel of combined data. CONCLUSIONS: Our study demonstrates that combined datasets, including SNP chips and exome chips, enhance both the imputation quality and genomic coverage of rare variants.
format Online
Article
Text
id pubmed-4696174
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4696174 2015-12-31 BMC Genomics Methodology Article BioMed Central 2015-12-29 /pmc/articles/PMC4696174/ /pubmed/26715385 http://dx.doi.org/10.1186/s12864-015-2192-y Text en © Kim et al. 2015 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title A new strategy for enhancing imputation quality of rare variants from next-generation sequencing data via combining SNP and exome chip data
topic Methodology Article