Assessing accuracy of imputation using different SNP panel densities in a multi-breed sheep population

BACKGROUND: Genotype imputation is a key element of the implementation of genomic selection within the New Zealand sheep industry, but many factors can influence imputation accuracy. Our objective was to provide practical directions on the implementation of imputation strategies in a multi-breed she...

Bibliographic Details
Main authors: Ventura, Ricardo V., Miller, Stephen P., Dodds, Ken G., Auvray, Benoit, Lee, Michael, Bixley, Matthew, Clarke, Shannon M., McEwan, John C.
Format: Online Article Text
Language: English
Published: BioMed Central 2016
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5035503/
https://www.ncbi.nlm.nih.gov/pubmed/27663120
http://dx.doi.org/10.1186/s12711-016-0244-7
author Ventura, Ricardo V.
Miller, Stephen P.
Dodds, Ken G.
Auvray, Benoit
Lee, Michael
Bixley, Matthew
Clarke, Shannon M.
McEwan, John C.
collection PubMed
description BACKGROUND: Genotype imputation is a key element of the implementation of genomic selection within the New Zealand sheep industry, but many factors can influence imputation accuracy. Our objective was to provide practical directions on the implementation of imputation strategies in a multi-breed sheep population genotyped with three single nucleotide polymorphism (SNP) panels: 5K, 50K and HD (600K SNPs). RESULTS: Imputation from 5K to HD was slightly better (0.6 %) than imputation from 5K to 50K. Two-step imputation from 5K to 50K and then from 50K to HD outperformed direct imputation from 5K to HD. A slight loss in imputation accuracy was observed when a large fixed reference population was used compared to a smaller within-breed reference (including all 50K genotypes on animals from different breeds excluding those in the validation set, i.e. to be imputed), but only for a few animals across all imputation scenarios from 5K to 50K. However, a major gain in imputation accuracy for a large proportion of animals (purebred and crossbred) justified the use of a fixed and large reference dataset for all situations. This study also investigated the loss in imputation accuracy specifically for SNPs located at the ends of each chromosome, and showed that only chromosome 26 had an overall imputation (5K to 50K) accuracy for 100 SNPs at each end higher than 60 % (r²). Most of the chromosomes displayed reduced imputation accuracy at at least one of their ends. Prediction of imputation accuracy based on the relatedness of low-density genotypes to those of the reference dataset, before imputation (without running imputation software), was also investigated. FIMPUTE V2.2 outperformed BEAGLE 3.3.2 across all imputation scenarios.
CONCLUSIONS: Imputation accuracy in sheep breeds can be improved by following a set of recommendations on SNP panels, software, strategies of imputation (one- or two-step imputation), and choice of the animals to be genotyped using both high- and low-density SNP panels. We present a method that predicts imputation accuracy for individual animals at the low-density level, before running imputation, which can be used to restrict genomic prediction only to the animals that can be imputed with sufficient accuracy.
format Online
Article
Text
id pubmed-5035503
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-5035503 2016-09-29 Assessing accuracy of imputation using different SNP panel densities in a multi-breed sheep population Ventura, Ricardo V. Miller, Stephen P. Dodds, Ken G. Auvray, Benoit Lee, Michael Bixley, Matthew Clarke, Shannon M. McEwan, John C. Genet Sel Evol Research Article BioMed Central 2016-09-23 /pmc/articles/PMC5035503/ /pubmed/27663120 http://dx.doi.org/10.1186/s12711-016-0244-7 Text en © The Author(s) 2016 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Assessing accuracy of imputation using different SNP panel densities in a multi-breed sheep population
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5035503/
https://www.ncbi.nlm.nih.gov/pubmed/27663120
http://dx.doi.org/10.1186/s12711-016-0244-7