The Dimensionality of Genomic Information and Its Effect on Genomic Prediction

The genomic relationship matrix (GRM) can be inverted by the algorithm for proven and young (APY) based on recursion on a random subset of animals. While a regular inverse has a cubic cost, the cost of the APY inverse can be close to linear. Theory for the APY assumes that the optimal size of the su...

Bibliographic Details
Main Authors: Pocrnic, Ivan; Lourenco, Daniela A. L.; Masuda, Yutaka; Legarra, Andres; Misztal, Ignacy
Format: Online Article Text
Language: English
Published: Genetics Society of America 2016
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4858800/
https://www.ncbi.nlm.nih.gov/pubmed/26944916
http://dx.doi.org/10.1534/genetics.116.187013
author Pocrnic, Ivan
Lourenco, Daniela A. L.
Masuda, Yutaka
Legarra, Andres
Misztal, Ignacy
collection PubMed
description The genomic relationship matrix (GRM) can be inverted by the algorithm for proven and young (APY) based on recursion on a random subset of animals. While a regular inverse has a cubic cost, the cost of the APY inverse can be close to linear. Theory for the APY assumes that the optimal size of the subset (maximizing accuracy of genomic predictions) is due to a limited dimensionality of the GRM, which is a function of the effective population size (N(e)). The objective of this study was to evaluate these assumptions by simulation. Six populations were simulated with approximate effective population size (N(e)) from 20 to 200. Each population consisted of 10 nonoverlapping generations, with 25,000 animals per generation and phenotypes available for generations 1–9. The last 3 generations were fully genotyped assuming genome length L = 30. The GRM was constructed for each population and analyzed for distribution of eigenvalues. Genomic estimated breeding values (GEBV) were computed by single-step GBLUP, using either a direct or an APY inverse of GRM. The sizes of the subset in APY were set to the number of the largest eigenvalues explaining x% of variation (EIGx, x = 90, 95, 98, 99) in GRM. Accuracies of GEBV for the last generation with the APY inverse peaked at EIG98 and were slightly lower with EIG95, EIG99, or the direct inverse. Most information in the GRM is contained in ∼N(e)L largest eigenvalues, with no information beyond 4N(e)L. Genomic predictions with the APY inverse of the GRM are more accurate than by the regular inverse.
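The two computations named in the description (counting the largest eigenvalues that explain x% of the variation in the GRM, and inverting the GRM by recursion on a core subset) can be illustrated with a minimal NumPy sketch. Everything here is an assumption for illustration and not the authors' code: the function names are hypothetical, the toy genotypes are independent binomial draws rather than the simulated populations of the study, the GRM is built as ZZ'/(2Σp(1−p)) with observed allele frequencies, and it is blended with 5% identity only to keep it invertible. The sketch uses the commonly stated form of the APY inverse, in which noncore animals are conditioned on core animals and their residual variances are kept as a diagonal matrix.

```python
import numpy as np


def simulate_genotypes(n_animals, n_markers, rng):
    """Toy genotypes coded 0/1/2 (assumption: independent markers)."""
    freqs = rng.uniform(0.1, 0.9, size=n_markers)
    return rng.binomial(2, freqs, size=(n_animals, n_markers)).astype(float)


def build_grm(genotypes):
    """GRM as ZZ' / (2 * sum p(1-p)), using observed allele frequencies."""
    p = genotypes.mean(axis=0) / 2.0
    z = genotypes - 2.0 * p
    return z @ z.T / (2.0 * np.sum(p * (1.0 - p)))


def n_eigenvalues_explaining(grm, fraction):
    """Number of largest eigenvalues whose sum reaches `fraction` of the total."""
    eigvals = np.linalg.eigvalsh(grm)[::-1]   # descending order
    eigvals = np.clip(eigvals, 0.0, None)     # drop tiny negative round-off
    cumulative = np.cumsum(eigvals) / eigvals.sum()
    count = int(np.searchsorted(cumulative, fraction)) + 1
    return min(count, len(eigvals))


def apy_inverse(grm, core):
    """Sparse-structured inverse from recursion of noncore on core animals."""
    n = grm.shape[0]
    core = np.asarray(core)
    noncore = np.setdiff1d(np.arange(n), core)
    gcc_inv = np.linalg.inv(grm[np.ix_(core, core)])
    gcn = grm[np.ix_(core, noncore)]
    p = gcc_inv @ gcn                         # regression of noncore on core
    # Diagonal of Gnn - Gnc Gcc^-1 Gcn: residual variance of each noncore animal.
    m = grm[noncore, noncore] - np.einsum("ij,ij->j", gcn, p)
    m_inv = 1.0 / m
    out = np.zeros_like(grm)
    out[np.ix_(core, core)] = gcc_inv + (p * m_inv) @ p.T
    out[np.ix_(core, noncore)] = -p * m_inv
    out[np.ix_(noncore, core)] = (-p * m_inv).T
    out[noncore, noncore] = m_inv             # noncore block stays diagonal
    return out


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    n = 200
    grm = 0.95 * build_grm(simulate_genotypes(n, 2000, rng)) + 0.05 * np.eye(n)
    for x in (90, 95, 98, 99):
        size = n_eigenvalues_explaining(grm, x / 100.0)
        core = rng.choice(n, size=size, replace=False)  # random core subset
        inv_apy = apy_inverse(grm, core)
        print(f"EIG{x}: core size {size}, inverse shape {inv_apy.shape}")
```

Only the core block requires a dense inversion, and the noncore block of the result is diagonal, which is why the cost can stay close to linear in the number of noncore animals. With unrelated toy animals the eigenvalue counts are not expected to resemble those reported for the simulated populations.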
format Online
Article
Text
id pubmed-4858800
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher Genetics Society of America
record_format MEDLINE/PubMed
spelling pubmed-4858800 2016-05-12 Genetics Investigations Genetics Society of America 2016-05 2016-03-04 Text en Copyright © 2016 by the Genetics Society of America. Available freely online through the author-supported open access option.
title The Dimensionality of Genomic Information and Its Effect on Genomic Prediction
topic Investigations