Evaluation of sequencing strategies for whole-genome imputation with hybrid peeling

BACKGROUND: For assembling large whole-genome sequence datasets for routine use in research and breeding, the sequencing strategy should be adapted to the methods that will be used later for variant discovery and imputation. In this study, we used simulation to explore the impact that the sequencing...

Bibliographic Details
Main authors: Ros-Freixedes, Roger, Whalen, Andrew, Gorjanc, Gregor, Mileham, Alan J., Hickey, John M.
Format: Online Article Text
Language: English
Published: BioMed Central 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7132986/
https://www.ncbi.nlm.nih.gov/pubmed/32248818
http://dx.doi.org/10.1186/s12711-020-00537-7
_version_ 1783517541591678976
author Ros-Freixedes, Roger
Whalen, Andrew
Gorjanc, Gregor
Mileham, Alan J.
Hickey, John M.
author_facet Ros-Freixedes, Roger
Whalen, Andrew
Gorjanc, Gregor
Mileham, Alan J.
Hickey, John M.
author_sort Ros-Freixedes, Roger
collection PubMed
description BACKGROUND: For assembling large whole-genome sequence datasets for routine use in research and breeding, the sequencing strategy should be adapted to the methods that will be used later for variant discovery and imputation. In this study, we used simulation to explore the impact that the sequencing strategy and level of sequencing investment have on the overall accuracy of imputation using hybrid peeling, a pedigree-based imputation method that is well suited for large livestock populations.
METHODS: We simulated marker array and whole-genome sequence data for 15 populations with simulated or real pedigrees that had different structures. In these populations, we evaluated the effect on imputation accuracy of seven methods for selecting which individuals to sequence, the generation of the pedigree to which the sequenced individuals belonged, the use of variable or uniform coverage, and the trade-off between the number of sequenced individuals and their sequencing coverage. For each population, we considered four levels of investment in sequencing that were proportional to the size of the population.
RESULTS: Imputation accuracy depended greatly on pedigree depth. The distribution of the sequenced individuals across the generations of the pedigree underlay the performance of the different methods used to select individuals to sequence and it was critical for achieving high imputation accuracy in both early and late generations. Imputation accuracy was highest with a uniform coverage across the sequenced individuals of 2× rather than variable coverage. An investment equivalent to the cost of sequencing 2% of the population at 2× provided high imputation accuracy. The gain in imputation accuracy from additional investment decreased with larger populations and higher levels of investment. However, to achieve the same imputation accuracy, a proportionally greater investment must be used in the smaller populations compared to the larger ones.
CONCLUSIONS: Suitable sequencing strategies for subsequent imputation with hybrid peeling involve sequencing ~2% of the population at a uniform coverage 2×, distributed preferably across all generations of the pedigree, except for the few earliest generations that lack genotyped ancestors. Such sequencing strategies are beneficial for generating whole-genome sequence data in populations with deep pedigrees of closely related individuals.
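The abstract expresses the sequencing investment as "the cost of sequencing 2% of the population at 2×" and mentions a trade-off between the number of sequenced individuals and their coverage. The sketch below illustrates that arithmetic only. It assumes, purely for illustration, that cost is proportional to individuals × coverage with no fixed per-sample cost; this cost model, the function names, and the example coverages are hypothetical and are not taken from the record.

```python
import math


def individuals_at_coverage(population_size, coverage,
                            budget_fraction=0.02, reference_coverage=2.0):
    """Individuals that fit in a fixed budget at a given uniform coverage.

    ASSUMPTION (not from the source): cost scales linearly with
    individuals x coverage, with no per-sample library preparation cost.
    The budget is the cost of sequencing `budget_fraction` of the
    population at `reference_coverage`.
    """
    budget_units = population_size * budget_fraction * reference_coverage
    # Small epsilon guards against floating-point results such as 79.999...
    return math.floor(budget_units / coverage + 1e-9)


def tradeoff_table(population_size, coverages=(1.0, 2.0, 5.0, 15.0)):
    """Map each candidate coverage to the number of individuals affordable."""
    return {c: individuals_at_coverage(population_size, c) for c in coverages}


if __name__ == "__main__":
    # Hypothetical population of 10,000 individuals.
    for cov, n in tradeoff_table(10_000).items():
        print(f"{cov:>4}x -> {n} individuals")
```

Under this simplified model the budget is fixed, so halving the coverage doubles the number of individuals that can be sequenced. A real budget would likely include per-sample costs that shift this balance.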
format Online
Article
Text
id pubmed-7132986
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-7132986 2020-04-11 Evaluation of sequencing strategies for whole-genome imputation with hybrid peeling Ros-Freixedes, Roger Whalen, Andrew Gorjanc, Gregor Mileham, Alan J. Hickey, John M. Genet Sel Evol Research Article BioMed Central 2020-04-06 /pmc/articles/PMC7132986/ /pubmed/32248818 http://dx.doi.org/10.1186/s12711-020-00537-7 Text en
© The Author(s) 2020. Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
spellingShingle Research Article
Ros-Freixedes, Roger
Whalen, Andrew
Gorjanc, Gregor
Mileham, Alan J.
Hickey, John M.
Evaluation of sequencing strategies for whole-genome imputation with hybrid peeling
title Evaluation of sequencing strategies for whole-genome imputation with hybrid peeling
title_full Evaluation of sequencing strategies for whole-genome imputation with hybrid peeling
title_fullStr Evaluation of sequencing strategies for whole-genome imputation with hybrid peeling
title_full_unstemmed Evaluation of sequencing strategies for whole-genome imputation with hybrid peeling
title_short Evaluation of sequencing strategies for whole-genome imputation with hybrid peeling
title_sort evaluation of sequencing strategies for whole-genome imputation with hybrid peeling
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7132986/
https://www.ncbi.nlm.nih.gov/pubmed/32248818
http://dx.doi.org/10.1186/s12711-020-00537-7
work_keys_str_mv AT rosfreixedesroger evaluationofsequencingstrategiesforwholegenomeimputationwithhybridpeeling
AT whalenandrew evaluationofsequencingstrategiesforwholegenomeimputationwithhybridpeeling
AT gorjancgregor evaluationofsequencingstrategiesforwholegenomeimputationwithhybridpeeling
AT milehamalanj evaluationofsequencingstrategiesforwholegenomeimputationwithhybridpeeling
AT hickeyjohnm evaluationofsequencingstrategiesforwholegenomeimputationwithhybridpeeling