
Hybrid peeling for fast and accurate calling, phasing, and imputation with sequence data of any coverage in pedigrees

BACKGROUND: In this paper, we extend multi-locus iterative peeling to provide a computationally efficient method for calling, phasing, and imputing sequence data of any coverage in small or large pedigrees. Our method, called hybrid peeling, uses multi-locus iterative peeling to estimate shared chro...


Bibliographic Details
Main authors: Whalen, Andrew; Ros-Freixedes, Roger; Wilson, David L.; Gorjanc, Gregor; Hickey, John M.
Format: Online Article Text
Language: English
Published: BioMed Central 2018
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6299538/
https://www.ncbi.nlm.nih.gov/pubmed/30563452
http://dx.doi.org/10.1186/s12711-018-0438-2
author Whalen, Andrew
Ros-Freixedes, Roger
Wilson, David L.
Gorjanc, Gregor
Hickey, John M.
author_sort Whalen, Andrew
collection PubMed
description BACKGROUND: In this paper, we extend multi-locus iterative peeling to provide a computationally efficient method for calling, phasing, and imputing sequence data of any coverage in small or large pedigrees. Our method, called hybrid peeling, uses multi-locus iterative peeling to estimate shared chromosome segments between parents and their offspring at a subset of loci, and then uses single-locus iterative peeling to aggregate genomic information across multiple generations at the remaining loci. RESULTS: Using a synthetic dataset, we first analysed the performance of hybrid peeling for calling and phasing genotypes in disconnected families, which contained only a focal individual and its parents and grandparents. Second, we analysed the performance of hybrid peeling for calling and phasing genotypes in the context of a full general pedigree. Third, we analysed the performance of hybrid peeling for imputing whole-genome sequence data to non-sequenced individuals in the population. We found that hybrid peeling substantially increased the number of called and phased genotypes by leveraging sequence information on related individuals. The calling rate and accuracy increased when the full pedigree was used compared to a reduced pedigree of just parents and grandparents. Finally, hybrid peeling imputed accurately whole-genome sequence to non-sequenced individuals. CONCLUSIONS: We believe that this algorithm will enable the generation of low cost and high accuracy whole-genome sequence data in many pedigreed populations. We make this algorithm available as a standalone program called AlphaPeel.
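The single-locus step described in the abstract combines an individual's own sequence reads with genotype information from its relatives. The sketch below illustrates that idea for a single locus and a single parent-offspring trio only. It is not AlphaPeel's implementation: the function names, the binomial read model, and the 1% sequencing error rate are all assumptions made for illustration, and it omits both the multi-locus segregation estimates and the iteration over a full pedigree.

```python
from math import comb

ERROR_RATE = 0.01  # assumed per-read sequencing error rate (not from the paper)


def genotype_likelihoods(ref_reads, alt_reads, error=ERROR_RATE):
    """Return P(reads | genotype) for genotypes carrying 0, 1, or 2 alt alleles."""
    n = ref_reads + alt_reads
    # Probability that one read shows the alt allele under each genotype.
    p_alt = [error, 0.5, 1.0 - error]
    return [comb(n, alt_reads) * p ** alt_reads * (1 - p) ** ref_reads
            for p in p_alt]


def transmit_alt(parent_probs):
    """Probability that a parent passes on the alt allele, given its genotype probabilities."""
    return sum(prob * genotype / 2.0 for genotype, prob in enumerate(parent_probs))


def offspring_prior(sire_probs, dam_probs):
    """Mendelian prior on the offspring genotype from the two parents."""
    p_sire = transmit_alt(sire_probs)
    p_dam = transmit_alt(dam_probs)
    return [(1 - p_sire) * (1 - p_dam),
            p_sire * (1 - p_dam) + (1 - p_sire) * p_dam,
            p_sire * p_dam]


def call_offspring(sire_probs, dam_probs, ref_reads, alt_reads):
    """Posterior genotype probabilities: parental prior times read likelihood, normalised."""
    prior = offspring_prior(sire_probs, dam_probs)
    likelihood = genotype_likelihoods(ref_reads, alt_reads)
    unnormalised = [p * l for p, l in zip(prior, likelihood)]
    total = sum(unnormalised)
    return [u / total for u in unnormalised]


if __name__ == "__main__":
    # Sire homozygous reference, dam homozygous alternative, one reference read
    # observed in the offspring: the parents make the heterozygote certain.
    print(call_offspring([1, 0, 0], [0, 0, 1], ref_reads=1, alt_reads=0))
```

With a single read, the offspring's own data cannot separate a homozygote from a heterozygote, but the parental genotypes can. This is the sense in which information from relatives raises the number of genotypes that can be called at low coverage.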
format Online
Article
Text
id pubmed-6299538
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-6299538 2018-12-20 Genet Sel Evol Research Article BioMed Central 2018-12-18 /pmc/articles/PMC6299538/ /pubmed/30563452 http://dx.doi.org/10.1186/s12711-018-0438-2 Text en
© The Author(s) 2018. Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Hybrid peeling for fast and accurate calling, phasing, and imputation with sequence data of any coverage in pedigrees
topic Research Article