Evaluating dimensionality reduction for genomic prediction
The development of genomic selection (GS) methods has allowed plant breeding programs to select favorable lines using genomic data before performing field trials. Improvements in genotyping technology have yielded high-dimensional genomic marker data which can be difficult to incorporate into statis...
Main authors: | Manthena, Vamsi; Jarquín, Diego; Varshney, Rajeev K.; Roorkiwal, Manish; Dixit, Girish Prasad; Bharadwaj, Chellapilla; Howard, Reka |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Frontiers Media S.A. 2022 |
Subjects: | Genetics |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9614092/ https://www.ncbi.nlm.nih.gov/pubmed/36313472 http://dx.doi.org/10.3389/fgene.2022.958780 |
_version_ | 1784820119368630272 |
---|---|
author | Manthena, Vamsi Jarquín, Diego Varshney, Rajeev K. Roorkiwal, Manish Dixit, Girish Prasad Bharadwaj, Chellapilla Howard, Reka |
author_sort | Manthena, Vamsi |
collection | PubMed |
description | The development of genomic selection (GS) methods has allowed plant breeding programs to select favorable lines using genomic data before performing field trials. Improvements in genotyping technology have yielded high-dimensional genomic marker data, which can be difficult to incorporate into statistical models. In this paper, we investigated the utility of applying dimensionality reduction (DR) methods as a pre-processing step for GS methods. We compared five DR methods and studied the trend in the prediction accuracies of each method as a function of the number of features retained. The effect of DR methods was studied using three models that involved the main effects of line, environment, marker, and the genotype by environment interactions. The methods were applied to a real data set containing 315 lines phenotyped in nine environments with 26,817 markers each. Regardless of the DR method and prediction model used, only a fraction of the features was sufficient to achieve maximum correlation. Our results underline the usefulness of DR methods as a key pre-processing step in GS models to improve computational efficiency in the face of the ever-increasing size of genomic data. |
format | Online Article Text |
id | pubmed-9614092 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-9614092 2022-10-29 Evaluating dimensionality reduction for genomic prediction Manthena, Vamsi Jarquín, Diego Varshney, Rajeev K. Roorkiwal, Manish Dixit, Girish Prasad Bharadwaj, Chellapilla Howard, Reka Front Genet Genetics Frontiers Media S.A. 2022-10-14 /pmc/articles/PMC9614092/ /pubmed/36313472 http://dx.doi.org/10.3389/fgene.2022.958780 Text en Copyright © 2022 Manthena, Jarquín, Varshney, Roorkiwal, Dixit, Bharadwaj and Howard. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
title | Evaluating dimensionality reduction for genomic prediction |
topic | Genetics |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9614092/ https://www.ncbi.nlm.nih.gov/pubmed/36313472 http://dx.doi.org/10.3389/fgene.2022.958780 |
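
The abstract in this record describes applying dimensionality reduction as a pre-processing step and tracking prediction accuracy (correlation) as a function of the number of features retained. The Python sketch below illustrates that general workflow only. Everything in it is an assumption made for illustration: the record does not name the five DR methods or the three prediction models, so PCA and ridge regression stand in for them, the marker data are simulated rather than taken from the study, and the function names (`pca_reduce`, `ridge_fit_predict`, `accuracy_by_k`, `simulate`) are hypothetical.

```python
import numpy as np


def pca_reduce(X_train, X_test, k):
    """Project both sets onto the top-k principal components of X_train.

    PCA is an illustrative stand-in, not necessarily a method from the paper.
    """
    mean = X_train.mean(axis=0)
    Xc = X_train - mean
    # Rows of Vt are principal directions, ordered by decreasing variance.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    W = Vt[:k].T
    return Xc @ W, (X_test - mean) @ W


def ridge_fit_predict(Z_train, y_train, Z_test, lam=1.0):
    """Ridge regression; the intercept is handled by centring, not penalised."""
    z_mean, y_mean = Z_train.mean(axis=0), y_train.mean()
    Zc = Z_train - z_mean
    A = Zc.T @ Zc + lam * np.eye(Zc.shape[1])
    beta = np.linalg.solve(A, Zc.T @ (y_train - y_mean))
    return (Z_test - z_mean) @ beta + y_mean


def accuracy_by_k(X, y, ks, n_train, lam=1.0):
    """Correlation between predicted and observed phenotypes for each k."""
    X_tr, X_te = X[:n_train], X[n_train:]
    y_tr, y_te = y[:n_train], y[n_train:]
    results = {}
    for k in ks:
        Z_tr, Z_te = pca_reduce(X_tr, X_te, k)
        pred = ridge_fit_predict(Z_tr, y_tr, Z_te, lam)
        results[k] = float(np.corrcoef(pred, y_te)[0, 1])
    return results


def simulate(n_lines=200, n_markers=1000, n_causal=20, seed=0):
    """Simulated 0/1/2 marker matrix and phenotype (NOT the study's data)."""
    rng = np.random.default_rng(seed)
    X = rng.integers(0, 3, size=(n_lines, n_markers)).astype(float)
    effects = np.zeros(n_markers)
    causal = rng.choice(n_markers, n_causal, replace=False)
    effects[causal] = rng.normal(size=n_causal)
    y = X @ effects + rng.normal(scale=1.0, size=n_lines)
    return X, y


if __name__ == "__main__":
    X, y = simulate()
    scores = accuracy_by_k(X, y, ks=[5, 20, 50, 100, 150], n_train=150)
    for k, r in scores.items():
        print(f"k={k:4d}  correlation={r:.3f}")
```

The projection is fitted on the training lines only and then applied to the held-out lines, so no information from the test set enters the reduction step. The correlations obtained from simulated data say nothing about the results reported in the article.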