On the use of whole-genome sequence data for across-breed genomic prediction and fine-scale mapping of QTL

BACKGROUND: Whole-genome sequence (WGS) data are increasingly available on large numbers of individuals in animal and plant breeding and in human genetics through second-generation resequencing technologies, 1000 genomes projects, and large-scale genotype imputation from lower marker densities. Here...

Bibliographic Details
Main Authors:	Meuwissen, Theo; van den Berg, Irene; Goddard, Mike
Format:	Online Article Text
Language:	English
Published:	BioMed Central 2021
Subjects:	Research Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7908738/ https://www.ncbi.nlm.nih.gov/pubmed/33637049 http://dx.doi.org/10.1186/s12711-021-00607-4

_version_	1783655781335302144
author	Meuwissen, Theo; van den Berg, Irene; Goddard, Mike
author_facet	Meuwissen, Theo; van den Berg, Irene; Goddard, Mike
author_sort	Meuwissen, Theo
collection	PubMed
description	BACKGROUND: Whole-genome sequence (WGS) data are increasingly available on large numbers of individuals in animal and plant breeding and in human genetics through second-generation resequencing technologies, 1000 genomes projects, and large-scale genotype imputation from lower marker densities. Here, we present a computationally fast implementation of a variable selection genomic prediction method that can handle WGS data on more than 35,000 individuals, test its accuracy for across-breed predictions, and assess its quantitative trait locus (QTL) mapping precision. METHODS: The Markov chain Monte Carlo (MCMC) variable selection model (Bayes GC) simultaneously fits a genomic best linear unbiased prediction (GBLUP) term, i.e. a polygenic effect whose correlations are described by a genomic relationship matrix (G), and a Bayes C term, i.e. a set of single nucleotide polymorphisms (SNPs) with large effects selected by the model. Computational speed is improved by a Metropolis–Hastings sampling scheme that directs computations to the SNPs that are, a priori, most likely to be included in the model. Speed is also improved by running many relatively short MCMC chains. Memory requirements are reduced by storing the genotype matrix in binary form. The model was tested on a WGS dataset containing Holstein, Jersey and Australian Red cattle. The data contained 4,809,520 genotypes on 35,549 individuals together with their milk, fat and protein yields, and fat and protein percentage traits. RESULTS: The prediction accuracies of the Jersey individuals improved by 1.5% when using across-breed GBLUP compared to within-breed predictions. Using WGS instead of 600k SNP-chip data yielded on average a 3% accuracy improvement for Australian Red cows. QTL were fine-mapped by locating the SNP with the highest posterior probability of being included in the model. Various QTL known from the literature were rediscovered, and a new SNP affecting milk production was discovered on chromosome 20 at 34.501126 Mb. Due to the high mapping precision, it was clear that many of the discovered QTL were the same across the five dairy traits. CONCLUSIONS: Across-breed Bayes GC genomic prediction improved prediction accuracies compared to GBLUP. The combination of across-breed WGS data and Bayesian genomic prediction proved remarkably effective for the fine-mapping of QTL.
format	Online Article Text
id	pubmed-7908738
institution	National Center for Biotechnology Information
language	English
publishDate	2021
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-7908738 2021-02-26 Genet Sel Evol Research Article BioMed Central 2021-02-26 /pmc/articles/PMC7908738/ /pubmed/33637049 http://dx.doi.org/10.1186/s12711-021-00607-4 Text en © The Author(s) 2021 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
spellingShingle	Research Article Meuwissen, Theo van den Berg, Irene Goddard, Mike On the use of whole-genome sequence data for across-breed genomic prediction and fine-scale mapping of QTL
title	On the use of whole-genome sequence data for across-breed genomic prediction and fine-scale mapping of QTL
topic	Research Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7908738/ https://www.ncbi.nlm.nih.gov/pubmed/33637049 http://dx.doi.org/10.1186/s12711-021-00607-4
work_keys_str_mv	AT meuwissentheo ontheuseofwholegenomesequencedataforacrossbreedgenomicpredictionandfinescalemappingofqtl AT vandenbergirene ontheuseofwholegenomesequencedataforacrossbreedgenomicpredictionandfinescalemappingofqtl AT goddardmike ontheuseofwholegenomesequencedataforacrossbreedgenomicpredictionandfinescalemappingofqtl
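
The abstract states that memory requirements are reduced by storing the genotype matrix in binary form, but this record does not describe the encoding. The following is a minimal sketch of one common approach, assuming each genotype is a code in 0..3 (for example 0, 1 or 2 copies of an allele, with 3 reserved for missing) packed at 2 bits per genotype. The function names `pack_genotypes` and `unpack_genotypes` are hypothetical and are not taken from the authors' software.

```python
import numpy as np


def pack_genotypes(genotypes):
    """Pack a 1-D array of genotype codes (0..3) into 2 bits each.

    Assumption: 2-bit packing, four genotypes per byte, lowest bits first.
    This is an illustrative layout, not the encoding used in the paper.
    """
    g = np.asarray(genotypes, dtype=np.uint8)
    if g.ndim != 1:
        raise ValueError("expected a 1-D array of genotype codes")
    if g.size and g.max() > 3:
        raise ValueError("genotype codes must be in 0..3")
    # Pad with zeros so the length is a multiple of four.
    pad = (-g.size) % 4
    g = np.concatenate([g, np.zeros(pad, dtype=np.uint8)])
    quads = g.reshape(-1, 4)
    # Place the four codes of each group into one byte.
    packed = (
        quads[:, 0]
        | (quads[:, 1] << 2)
        | (quads[:, 2] << 4)
        | (quads[:, 3] << 6)
    )
    return packed.astype(np.uint8)


def unpack_genotypes(packed, n):
    """Recover the first n genotype codes from a packed byte array."""
    p = np.asarray(packed, dtype=np.uint8)
    shifts = np.array([0, 2, 4, 6], dtype=np.uint8)
    # Each byte expands to four 2-bit codes, in the order they were packed.
    codes = (p[:, None] >> shifts) & 3
    return codes.reshape(-1)[:n]
```

Under this assumed layout, and reading the abstract's 4,809,520 as the number of SNPs per individual (also an assumption), the 35,549 individuals would need roughly 43 GB at 2 bits per genotype, compared with roughly 171 GB at one byte per genotype.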
