
Machine Learning Applied to the Search for Nonlinear Features in Breeding Populations


Bibliographic Details
Main Authors: Gabur, Iulian; Simioniuc, Danut Petru; Snowdon, Rod J.; Cristea, Dan
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2022
Subjects: Artificial Intelligence
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9164111/
https://www.ncbi.nlm.nih.gov/pubmed/35669178
http://dx.doi.org/10.3389/frai.2022.876578
author Gabur, Iulian
Simioniuc, Danut Petru
Snowdon, Rod J.
Cristea, Dan
collection PubMed
description Large plant breeding populations are traditionally a source of novel allelic diversity and are at the core of selection efforts for elite material. Finding rare diversity requires a deep understanding of biological interactions between the genetic makeup of one genotype and its environmental conditions. Most modern breeding programs still rely on linear regression models to solve this problem, generalizing the complex genotype by phenotype interactions through manually constructed linear features. However, the identification of positive alleles vs. background can be addressed using deep learning approaches that have the capacity to learn complex nonlinear functions for the inputs. Machine learning (ML) is an artificial intelligence (AI) approach involving a range of algorithms to learn from input data sets and predict outcomes in other related samples. This paper describes a variety of techniques that include supervised and unsupervised ML algorithms to improve our understanding of nonlinear interactions from plant breeding data sets. Feature selection (FS) methods are combined with linear and nonlinear predictors and compared to traditional prediction methods used in plant breeding. Recent advances in ML allowed the construction of complex models that have the capacity to better differentiate between positive alleles and the genetic background. Using real plant breeding program data, we show that ML methods have the ability to outperform current approaches, increase prediction accuracies, decrease the computing time drastically, and improve the detection of important alleles involved in qualitative or quantitative traits.
format Online
Article
Text
id pubmed-9164111
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-9164111 2022-06-05 Machine Learning Applied to the Search for Nonlinear Features in Breeding Populations Gabur, Iulian Simioniuc, Danut Petru Snowdon, Rod J. Cristea, Dan Front Artif Intell Artificial Intelligence Frontiers Media S.A.
2022-05-20 /pmc/articles/PMC9164111/ /pubmed/35669178 http://dx.doi.org/10.3389/frai.2022.876578 Text en Copyright © 2022 Gabur, Simioniuc, Snowdon and Cristea. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title Machine Learning Applied to the Search for Nonlinear Features in Breeding Populations
topic Artificial Intelligence