
Large-scale genomic prediction using singular value decomposition of the genotype matrix

BACKGROUND: For marker effect models and genomic animal models, computational requirements increase with the number of loci and the number of genotyped individuals, respectively. In the latter case, the inverse genomic relationship matrix (GRM) is typically needed, which is computationally demanding...


Bibliographic Details
Main authors:	Ødegård, Jørgen; Indahl, Ulf; Strandén, Ismo; Meuwissen, Theo H. E.
Format:	Online Article Text
Language:	English
Published:	BioMed Central 2018
Subjects:	Research Article
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5831701/ https://www.ncbi.nlm.nih.gov/pubmed/29490611 http://dx.doi.org/10.1186/s12711-018-0373-2

author	Ødegård, Jørgen; Indahl, Ulf; Strandén, Ismo; Meuwissen, Theo H. E.
collection	PubMed
description	BACKGROUND: For marker effect models and genomic animal models, computational requirements increase with the number of loci and the number of genotyped individuals, respectively. In the latter case, the inverse genomic relationship matrix (GRM) is typically needed, which is computationally demanding to compute for large datasets. Thus, there is a great need for dimensionality-reduction methods that can analyze massive genomic data. For this purpose, we developed reduced-dimension singular value decomposition (SVD) based models for genomic prediction. METHODS: Fast SVD is performed by analyzing different chromosomes/genome segments in parallel and/or by restricting SVD to a limited core of genotyped individuals, producing chromosome- or segment-specific principal components (PC). Given a limited effective population size, nearly all the genetic variation can be effectively captured by a limited number of PC. Genomic prediction can then be performed either by PC ridge regression (PCRR) or by genomic animal models using an inverse GRM computed from the chosen PC (PCIG). In the latter case, computation of the inverse GRM will be feasible for any number of genotyped individuals and can be readily produced row- or element-wise. RESULTS: Using simulated data, we show that PCRR and PCIG models, using chromosome-wise SVD of a core sample of individuals, are appropriate for genomic prediction in a larger population, and result in predicted breeding values virtually identical to those of the original full-dimension genomic model (r = 1.000). Compared with other algorithms (e.g. the algorithm for proven and young animals, APY), the (chromosome-wise SVD-based) PCRR and PCIG models were more robust to the size of the core sample, giving nearly identical results even down to 500 core individuals. The method was also successfully tested on a large multi-breed dataset. CONCLUSIONS: SVD can be used for dimensionality reduction of large genomic datasets. After SVD, genomic prediction using dense genomic data and many genotyped individuals can be done in a computationally efficient manner. Using this method, the resulting genomic estimated breeding values were virtually identical to those computed from a full-dimension genomic model. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12711-018-0373-2) contains supplementary material, which is available to authorized users.
format	Online Article Text
id	pubmed-5831701
institution	National Center for Biotechnology Information
language	English
publishDate	2018
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-5831701 2018-03-05 Large-scale genomic prediction using singular value decomposition of the genotype matrix. Ødegård, Jørgen; Indahl, Ulf; Strandén, Ismo; Meuwissen, Theo H. E. Genet Sel Evol. Research Article. BioMed Central 2018-02-28 /pmc/articles/PMC5831701/ /pubmed/29490611 http://dx.doi.org/10.1186/s12711-018-0373-2 Text en © The Author(s) 2018. Open Access: This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title	Large-scale genomic prediction using singular value decomposition of the genotype matrix
topic	Research Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5831701/ https://www.ncbi.nlm.nih.gov/pubmed/29490611 http://dx.doi.org/10.1186/s12711-018-0373-2
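
The PC ridge regression (PCRR) idea summarized in the description field can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `pc_ridge_regression`, the ridge parameter `lam`, the toy data, and the single-segment setup (no chromosome-wise split, no fixed effects) are all assumptions made for illustration. The sketch runs an SVD on a core subset of individuals, projects all individuals onto the leading principal components, and fits a ridge regression on those scores.

```python
import numpy as np

def pc_ridge_regression(Z, y, core_idx, k, lam):
    """Illustrative PCRR sketch (hypothetical helper, not the paper's code).

    Z        : centered genotype matrix, individuals x loci
    y        : centered phenotypes, one per individual
    core_idx : row indices of the core individuals used for the SVD
    k        : number of leading principal components to keep
    lam      : ridge penalty (assumed known here)
    """
    # SVD restricted to the core individuals: Z_core = U S V'
    _, _, Vt = np.linalg.svd(Z[core_idx], full_matrices=False)
    V_k = Vt[:k].T                     # loadings of the k leading PCs (loci x k)
    T = Z @ V_k                        # PC scores for all individuals (n x k)
    # Ridge regression on the scores: solve (T'T + lam*I) b = T'y
    b = np.linalg.solve(T.T @ T + lam * np.eye(V_k.shape[1]), T.T @ y)
    return T @ b                       # predicted genomic values

# Toy data (simulated here purely for illustration)
rng = np.random.default_rng(0)
n, m = 60, 200
Z = rng.binomial(2, 0.3, size=(n, m)).astype(float)
Z -= Z.mean(axis=0)                    # center each locus
y = Z @ rng.normal(0.0, 0.1, m) + rng.normal(0.0, 1.0, n)
y -= y.mean()
lam = 5.0

# Full-dimension ridge regression on all marker effects, for reference
full = Z @ np.linalg.solve(Z.T @ Z + lam * np.eye(m), Z.T @ y)

# PCRR with every individual in the core and all PCs retained
pcrr_all = pc_ridge_regression(Z, y, np.arange(n), k=n, lam=lam)

# PCRR with a reduced core and fewer PCs
pcrr_small = pc_ridge_regression(Z, y, np.arange(30), k=20, lam=lam)
```

When every individual is in the core and all components are retained, the PC-based predictions coincide with the full marker ridge regression up to numerical error, because the retained components span the entire row space of the genotype matrix. With a smaller core or fewer components, the result is an approximation whose quality depends on the data.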
