Evaluating dimensionality reduction for genomic prediction

The development of genomic selection (GS) methods has allowed plant breeding programs to select favorable lines using genomic data before performing field trials. Improvements in genotyping technology have yielded high-dimensional genomic marker data which can be difficult to incorporate into statis...

Bibliographic Details
Main authors: Manthena, Vamsi; Jarquín, Diego; Varshney, Rajeev K.; Roorkiwal, Manish; Dixit, Girish Prasad; Bharadwaj, Chellapilla; Howard, Reka
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2022
Subjects: Genetics
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9614092/
https://www.ncbi.nlm.nih.gov/pubmed/36313472
http://dx.doi.org/10.3389/fgene.2022.958780
author Manthena, Vamsi
Jarquín, Diego
Varshney, Rajeev K.
Roorkiwal, Manish
Dixit, Girish Prasad
Bharadwaj, Chellapilla
Howard, Reka
collection PubMed
description The development of genomic selection (GS) methods has allowed plant breeding programs to select favorable lines using genomic data before performing field trials. Improvements in genotyping technology have yielded high-dimensional genomic marker data which can be difficult to incorporate into statistical models. In this paper, we investigated the utility of applying dimensionality reduction (DR) methods as a pre-processing step for GS methods. We compared five DR methods and studied the trend in the prediction accuracies of each method as a function of the number of features retained. The effect of DR methods was studied using three models that involved the main effects of line, environment, marker, and the genotype by environment interactions. The methods were applied on a real data set containing 315 lines phenotyped in nine environments with 26,817 markers each. Regardless of the DR method and prediction model used, only a fraction of features was sufficient to achieve maximum correlation. Our results underline the usefulness of DR methods as a key pre-processing step in GS models to improve computational efficiency in the face of ever-increasing size of genomic data.
format Online
Article
Text
id pubmed-9614092
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-9614092 2022-10-29 Front Genet Genetics Frontiers Media S.A. 2022-10-14 /pmc/articles/PMC9614092/ /pubmed/36313472 http://dx.doi.org/10.3389/fgene.2022.958780 Text en
Copyright © 2022 Manthena, Jarquín, Varshney, Roorkiwal, Dixit, Bharadwaj and Howard. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY), https://creativecommons.org/licenses/by/4.0/. The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title Evaluating dimensionality reduction for genomic prediction
topic Genetics