Improved imputation of low-frequency and rare variants using the UK10K haplotype reference panel

Bibliographic Details
Main Authors: Huang, Jie, Howie, Bryan, McCarthy, Shane, Memari, Yasin, Walter, Klaudia, Min, Josine L., Danecek, Petr, Malerba, Giovanni, Trabetti, Elisabetta, Zheng, Hou-Feng, Gambaro, Giovanni, Richards, J. Brent, Durbin, Richard, Timpson, Nicholas J., Marchini, Jonathan, Soranzo, Nicole
Format: Online Article Text
Language: English
Published: Nature Pub. Group 2015
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4579394/
https://www.ncbi.nlm.nih.gov/pubmed/26368830
http://dx.doi.org/10.1038/ncomms9111
author Huang, Jie
Howie, Bryan
McCarthy, Shane
Memari, Yasin
Walter, Klaudia
Min, Josine L.
Danecek, Petr
Malerba, Giovanni
Trabetti, Elisabetta
Zheng, Hou-Feng
Gambaro, Giovanni
Richards, J. Brent
Durbin, Richard
Timpson, Nicholas J.
Marchini, Jonathan
Soranzo, Nicole
author_sort Huang, Jie
collection PubMed
description Imputing genotypes from reference panels created by whole-genome sequencing (WGS) provides a cost-effective strategy for augmenting the single-nucleotide polymorphism (SNP) content of genome-wide arrays. The UK10K Cohorts project has generated a data set of 3,781 whole genomes sequenced at low depth (average 7x), aiming to exhaustively characterize genetic variation down to 0.1% minor allele frequency in the British population. Here we demonstrate the value of this resource for improving imputation accuracy at rare and low-frequency variants in both a UK and an Italian population. We show that large increases in imputation accuracy can be achieved by re-phasing WGS reference panels after initial genotype calling. We also present a method for combining WGS panels to improve variant coverage and downstream imputation accuracy, which we illustrate by integrating 7,562 WGS haplotypes from the UK10K project with 2,184 haplotypes from the 1000 Genomes Project. Finally, we introduce a novel approximation that maintains speed without sacrificing imputation accuracy for rare variants.
format Online
Article
Text
id pubmed-4579394
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher Nature Pub. Group
record_format MEDLINE/PubMed
spelling pubmed-4579394 2015-10-01 Nat Commun Article Nature Pub. Group 2015-09-14 /pmc/articles/PMC4579394/ /pubmed/26368830 http://dx.doi.org/10.1038/ncomms9111 Text en
Copyright © 2015, Nature Publishing Group, a division of Macmillan Publishers Limited. All Rights Reserved. This work is licensed under a Creative Commons Attribution 4.0 International License. The images or other third party material in this article are included in the article's Creative Commons license, unless indicated otherwise in the credit line; if the material is not included under the Creative Commons license, users will need to obtain permission from the license holder to reproduce the material. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/
title Improved imputation of low-frequency and rare variants using the UK10K haplotype reference panel
topic Article