The Dimensionality of Genomic Information and Its Effect on Genomic Prediction
The genomic relationship matrix (GRM) can be inverted by the algorithm for proven and young (APY) based on recursion on a random subset of animals. While a regular inverse has a cubic cost, the cost of the APY inverse can be close to linear. Theory for the APY assumes that the optimal size of the su...
Main authors: | Pocrnic, Ivan; Lourenco, Daniela A. L.; Masuda, Yutaka; Legarra, Andres; Misztal, Ignacy |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Genetics Society of America, 2016 |
Subjects: | Investigations |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4858800/ https://www.ncbi.nlm.nih.gov/pubmed/26944916 http://dx.doi.org/10.1534/genetics.116.187013 |
_version_ | 1782430858267852800 |
---|---|
author | Pocrnic, Ivan; Lourenco, Daniela A. L.; Masuda, Yutaka; Legarra, Andres; Misztal, Ignacy |
collection | PubMed |
description | The genomic relationship matrix (GRM) can be inverted by the algorithm for proven and young (APY) based on recursion on a random subset of animals. While a regular inverse has a cubic cost, the cost of the APY inverse can be close to linear. Theory for the APY assumes that the optimal size of the subset (maximizing accuracy of genomic predictions) is due to a limited dimensionality of the GRM, which is a function of the effective population size (N(e)). The objective of this study was to evaluate these assumptions by simulation. Six populations were simulated with approximate effective population size (N(e)) from 20 to 200. Each population consisted of 10 nonoverlapping generations, with 25,000 animals per generation and phenotypes available for generations 1–9. The last 3 generations were fully genotyped assuming genome length L = 30. The GRM was constructed for each population and analyzed for distribution of eigenvalues. Genomic estimated breeding values (GEBV) were computed by single-step GBLUP, using either a direct or an APY inverse of GRM. The sizes of the subset in APY were set to the number of the largest eigenvalues explaining x% of variation (EIGx, x = 90, 95, 98, 99) in GRM. Accuracies of GEBV for the last generation with the APY inverse peaked at EIG98 and were slightly lower with EIG95, EIG99, or the direct inverse. Most information in the GRM is contained in ∼N(e)L largest eigenvalues, with no information beyond 4N(e)L. Genomic predictions with the APY inverse of the GRM are more accurate than by the regular inverse. |
format | Online Article Text |
id | pubmed-4858800 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2016 |
publisher | Genetics Society of America |
record_format | MEDLINE/PubMed |
spelling | pubmed-4858800 2016-05-12 The Dimensionality of Genomic Information and Its Effect on Genomic Prediction. Pocrnic, Ivan; Lourenco, Daniela A. L.; Masuda, Yutaka; Legarra, Andres; Misztal, Ignacy. Genetics. Investigations. Genetics Society of America 2016-05 2016-03-04 /pmc/articles/PMC4858800/ /pubmed/26944916 http://dx.doi.org/10.1534/genetics.116.187013 Text en Copyright © 2016 by the Genetics Society of America. Available freely online through the author-supported open access option. |
title | The Dimensionality of Genomic Information and Its Effect on Genomic Prediction |
topic | Investigations |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4858800/ https://www.ncbi.nlm.nih.gov/pubmed/26944916 http://dx.doi.org/10.1534/genetics.116.187013 |
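The abstract in this record describes two computations: choosing the APY subset size as the number of largest eigenvalues explaining x% of the variation in the GRM (EIGx), and inverting the GRM by recursion on that subset. The sketch below illustrates both on a toy matrix. It is not the authors' code. The record does not say how the GRM was built or give the APY formula, so everything here is an assumption: the toy GRM uses a VanRaden-style construction with a small diagonal blend, the APY inverse follows the block form commonly given in the APY literature, and the names `simulate_grm`, `n_eigen_for_fraction`, and `apy_inverse` are hypothetical.

```python
import numpy as np


def simulate_grm(n_animals=60, n_markers=500, seed=0):
    """Toy GRM from random genotypes (assumed VanRaden-style; illustrative only)."""
    rng = np.random.default_rng(seed)
    p = rng.uniform(0.1, 0.9, size=n_markers)            # allele frequencies
    M = rng.binomial(2, p, size=(n_animals, n_markers)).astype(float)
    Z = M - 2.0 * p                                      # center genotypes by 2p
    G = Z @ Z.T / (2.0 * np.sum(p * (1.0 - p)))
    return G + 0.01 * np.eye(n_animals)                  # small blend keeps G invertible


def n_eigen_for_fraction(G, fraction):
    """Number of largest eigenvalues whose sum reaches `fraction` of the total (EIGx)."""
    eig = np.linalg.eigvalsh(G)[::-1]                    # eigenvalues, descending
    cum = np.cumsum(eig) / eig.sum()                     # cumulative share of variation
    k = int(np.searchsorted(cum, fraction)) + 1          # first index reaching the target
    return min(k, eig.size)


def apy_inverse(G, core):
    """APY-style inverse of G given unique core indices (assumed block form).

    inv = [[Gcc^-1, 0], [0, 0]] + T diag(1/m) T', with T = [-Gcc^-1 Gcn; I]
    and m_j = g_jj - g_jc Gcc^-1 g_cj for each noncore animal j.
    """
    n = G.shape[0]
    core = np.asarray(core)
    noncore = np.setdiff1d(np.arange(n), core)
    Gcc_inv = np.linalg.inv(G[np.ix_(core, core)])       # only the core block is inverted
    Ginv = np.zeros_like(G)
    Ginv[np.ix_(core, core)] = Gcc_inv
    if noncore.size:
        Gcn = G[np.ix_(core, noncore)]
        P = Gcc_inv @ Gcn                                # regression of noncore on core
        m = np.diag(G)[noncore] - np.einsum("ij,ij->j", Gcn, P)
        T = np.zeros((n, noncore.size))
        T[core, :] = -P
        T[noncore, np.arange(noncore.size)] = 1.0
        Ginv += (T / m) @ T.T                            # T diag(1/m) T'
    return Ginv


if __name__ == "__main__":
    G = simulate_grm()
    rng = np.random.default_rng(1)
    for x in (0.90, 0.95, 0.98, 0.99):
        k = n_eigen_for_fraction(G, x)
        core = rng.choice(G.shape[0], size=k, replace=False)   # random core subset
        err = np.abs(apy_inverse(G, core) @ G - np.eye(G.shape[0])).max()
        print(f"EIG{int(x * 100)}: core size {k}, max |Ginv_APY G - I| = {err:.3g}")
```

In this form only the core block is inverted densely and the noncore part reduces to a diagonal, which is consistent with the abstract's remark that the cost of the APY inverse can be close to linear. Random unstructured genotypes like these have no small effective population size, so the toy output says nothing about the accuracy results reported in the abstract.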