Predictions from algorithmic modeling result in better decisions than from data modeling for soybean iron deficiency chlorosis

In soybean variety development and genetic improvement projects, iron deficiency chlorosis (IDC) is visually assessed as an ordinal response variable. Linear Mixed Models for Genomic Prediction (GP) have been developed, compared, and used to select continuous plant traits such as yield, height, and...

Bibliographic Details
Main authors:	Xu, Zhanyou; Kurek, Andreomar; Cannon, Steven B.; Beavis, William D.
Format:	Online Article Text
Language:	English
Published:	Public Library of Science 2021
Subjects:	Research Article
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8270216/ https://www.ncbi.nlm.nih.gov/pubmed/34242220 http://dx.doi.org/10.1371/journal.pone.0240948

author	Xu, Zhanyou Kurek, Andreomar Cannon, Steven B. Beavis, William D.
collection	PubMed
description	In soybean variety development and genetic improvement projects, iron deficiency chlorosis (IDC) is visually assessed as an ordinal response variable. Linear Mixed Models for Genomic Prediction (GP) have been developed, compared, and used to select continuous plant traits such as yield, height, and maturity, but can be inappropriate for ordinal traits. Generalized Linear Mixed Models have been developed for GP of ordinal response variables. However, neither approach addresses the most important questions for cultivar development and genetic improvement: How frequently are the ‘wrong’ genotypes retained, and how often are the ‘correct’ genotypes discarded? The research objective reported herein was to compare outcomes from four data modeling and six algorithmic modeling GP methods applied to IDC using decision metrics appropriate for variety development and genetic improvement projects. Appropriate metrics for decision making consist of specificity, sensitivity, precision, decision accuracy, and area under the receiver operating characteristic curve. Data modeling methods for GP included ridge regression, logistic regression, penalized logistic regression, and Bayesian generalized linear regression. Algorithmic modeling methods included Random Forest, Gradient Boosting Machine, Support Vector Machine, K-Nearest Neighbors, Naïve Bayes, and Artificial Neural Network. We found that a Support Vector Machine model provided the most specific decisions of correctly discarding IDC-susceptible genotypes, while a Random Forest model resulted in the best decisions of retaining IDC-tolerant genotypes, as well as the best outcomes when considering all decision metrics. Overall, the predictions from algorithmic modeling result in better decisions than from data modeling methods applied to soybean IDC.
format	Online Article Text
id	pubmed-8270216
institution	National Center for Biotechnology Information
language	English
publishDate	2021
publisher	Public Library of Science
record_format	MEDLINE/PubMed
spelling	pubmed-8270216 2021-07-21 PLoS One Research Article Public Library of Science 2021-07-09 /pmc/articles/PMC8270216/ /pubmed/34242220 http://dx.doi.org/10.1371/journal.pone.0240948 Text en https://creativecommons.org/publicdomain/zero/1.0/ This is an open access article, free of all copyright, and may be freely reproduced, distributed, transmitted, modified, built upon, or otherwise used by anyone for any lawful purpose. The work is made available under the Creative Commons CC0 (https://creativecommons.org/publicdomain/zero/1.0/) public domain dedication.
title	Predictions from algorithmic modeling result in better decisions than from data modeling for soybean iron deficiency chlorosis
topic	Research Article
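
The decision metrics named in the description (sensitivity, specificity, precision, decision accuracy, and area under the receiver operating characteristic curve) can be illustrated with a minimal Python sketch. It is not the authors' code. It assumes the ordinal IDC score has already been reduced to a binary call in which True means "tolerant, retain" and False means "susceptible, discard", and it assumes that "decision accuracy" means the plain proportion of correct decisions. The function names, the 0.5 cutoff, and the toy data are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class DecisionMetrics:
    sensitivity: float  # tolerant genotypes correctly retained
    specificity: float  # susceptible genotypes correctly discarded
    precision: float    # retained genotypes that are truly tolerant
    accuracy: float     # all decisions that are correct (assumed definition)


def _ratio(numerator, denominator):
    # Return NaN instead of raising when a class or decision is absent.
    return numerator / denominator if denominator else float("nan")


def decision_metrics(observed, predicted):
    """Both arguments are sequences of booleans; True = tolerant (retain)."""
    pairs = list(zip(observed, predicted))
    tp = sum(1 for o, p in pairs if o and p)          # correctly retained
    tn = sum(1 for o, p in pairs if not o and not p)  # correctly discarded
    fp = sum(1 for o, p in pairs if not o and p)      # 'wrong' genotype retained
    fn = sum(1 for o, p in pairs if o and not p)      # 'correct' genotype discarded
    return DecisionMetrics(
        sensitivity=_ratio(tp, tp + fn),
        specificity=_ratio(tn, tn + fp),
        precision=_ratio(tp, tp + fp),
        accuracy=_ratio(tp + tn, len(pairs)),
    )


def auc(observed, scores):
    """Rank-based AUC: chance that a tolerant genotype outscores a susceptible one.

    Ties count as half a win (Mann-Whitney formulation).
    """
    pos = [s for o, s in zip(observed, scores) if o]
    neg = [s for o, s in zip(observed, scores) if not o]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return _ratio(wins, len(pos) * len(neg))


if __name__ == "__main__":
    # Hypothetical toy data: 3 tolerant and 5 susceptible genotypes.
    observed = [True, True, True, False, False, False, False, False]
    scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]
    predicted = [s >= 0.5 for s in scores]  # hypothetical retention cutoff
    print(decision_metrics(observed, predicted))
    print("AUC:", auc(observed, scores))
```

In this framing, a false positive is a 'wrong' genotype retained and a false negative is a 'correct' genotype discarded, which are the two error types the description singles out.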
