Genomic prediction based on selected variants from imputed whole-genome sequence data in Australian sheep populations

BACKGROUND: Whole-genome sequence (WGS) data could contain information on genetic variants at or in high linkage disequilibrium with causative mutations that underlie the genetic variation of polygenic traits. Thus far, genomic prediction accuracy has shown limited increase when using such informati...

Bibliographic Details
Main authors: Moghaddar, Nasir; Khansefid, Majid; van der Werf, Julius H. J.; Bolormaa, Sunduimijid; Duijvesteijn, Naomi; Clark, Samuel A.; Swan, Andrew A.; Daetwyler, Hans D.; MacLeod, Iona M.
Format: Online Article Text
Language: English
Published: BioMed Central 2019
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6896509/
https://www.ncbi.nlm.nih.gov/pubmed/31805849
http://dx.doi.org/10.1186/s12711-019-0514-2
author Moghaddar, Nasir
Khansefid, Majid
van der Werf, Julius H. J.
Bolormaa, Sunduimijid
Duijvesteijn, Naomi
Clark, Samuel A.
Swan, Andrew A.
Daetwyler, Hans D.
MacLeod, Iona M.
collection PubMed
description BACKGROUND: Whole-genome sequence (WGS) data could contain information on genetic variants at or in high linkage disequilibrium with causative mutations that underlie the genetic variation of polygenic traits. Thus far, genomic prediction accuracy has shown limited increase when using such information in dairy cattle studies, in which one or a few breeds with limited diversity predominate. The objective of our study was to evaluate the accuracy of genomic prediction in a multi-breed Australian sheep population with relatively weakly related target individuals, when using information on imputed WGS genotypes.
METHODS: Between 9,626 and 26,657 animals with phenotypes were available for nine economically important sheep production traits, all of which had imputed WGS genotypes. About 30% of the data were used to discover predictive single nucleotide polymorphisms (SNPs) through a genome-wide association study (GWAS), and the remaining data were used for training and validation of genomic prediction. Prediction accuracy using selected variants from imputed sequence data was compared to that using a standard array of 50k SNP genotypes, with both genomic best linear unbiased prediction (GBLUP) and Bayesian methods (BayesR/BayesRC). Accuracy of genomic prediction was evaluated in two independent populations that were each only weakly related to the training set, one being purebred Merino and the other crossbred Border Leicester × Merino sheep.
RESULTS: A substantial improvement in prediction accuracy was observed when selected sequence variants were fitted alongside 50k genotypes as a separate variance component in GBLUP (2GBLUP) or in Bayesian analysis as a separate category of SNPs (BayesRC). From an average accuracy of 0.27 in both validation sets for the 50k array, the average absolute increase in accuracy across traits with 2GBLUP was 0.083 and 0.073 for purebred and crossbred animals, respectively, whereas with BayesRC it was 0.102 and 0.087. The average gain in accuracy was smaller when selected sequence variants were treated in the same category as 50k SNPs. Very little improvement over 50k prediction was observed when using all WGS variants.
CONCLUSIONS: Accuracy of genomic prediction in diverse sheep populations increased substantially by using variants selected from whole-genome sequence data based on an independent multi-breed GWAS, when compared to genomic prediction using standard 50k genotypes.
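The gains quoted in the RESULTS can be put on a common scale with a little arithmetic. The sketch below is illustrative only: it adds each reported average absolute increase to the 0.27 baseline and expresses it as a percentage of that baseline. The variable and function names are hypothetical, and the implied accuracies and percentages are derived here from the averages in the abstract, not reported by the authors; a ratio of averages need not equal the average of per-trait ratios.

```python
# Figures quoted from the abstract; all names below are illustrative only.
BASELINE_50K = 0.27  # average accuracy with the 50k array in both validation sets

# Average absolute increase in accuracy over the 50k baseline, as reported.
GAINS = {
    ("2GBLUP", "purebred"): 0.083,
    ("2GBLUP", "crossbred"): 0.073,
    ("BayesRC", "purebred"): 0.102,
    ("BayesRC", "crossbred"): 0.087,
}


def summarize(baseline, gains):
    """Map each (method, validation set) to (implied accuracy, relative gain in %)."""
    return {
        key: (round(baseline + gain, 3), round(100 * gain / baseline, 1))
        for key, gain in gains.items()
    }


SUMMARY = summarize(BASELINE_50K, GAINS)

if __name__ == "__main__":
    for (method, population), (accuracy, pct) in SUMMARY.items():
        print(f"{method:8s} {population:10s} accuracy ~ {accuracy:.3f} (+{pct:.1f}%)")
```

On these averages, the relative improvement over the 50k array is roughly 27% to 38%, with BayesRC ahead of 2GBLUP in both validation sets.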
format Online
Article
Text
id pubmed-6896509
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-6896509 2019-12-11 Genet Sel Evol Research Article BioMed Central 2019-12-05 /pmc/articles/PMC6896509/ /pubmed/31805849 http://dx.doi.org/10.1186/s12711-019-0514-2 Text en © The Author(s) 2019. Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Genomic prediction based on selected variants from imputed whole-genome sequence data in Australian sheep populations
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6896509/
https://www.ncbi.nlm.nih.gov/pubmed/31805849
http://dx.doi.org/10.1186/s12711-019-0514-2