
Multi-modal deep learning improves grain yield prediction in wheat breeding by fusing genomics and phenomics

MOTIVATION: Developing new crop varieties with superior performance is highly important to ensure robust and sustainable global food security. The speed of variety development is limited by long field cycles and advanced generation selections in plant breeding programs. While methods to predict yiel...


Bibliographic Details
Main authors: Togninalli, Matteo; Wang, Xu; Kucera, Tim; Shrestha, Sandesh; Juliana, Philomin; Mondal, Suchismita; Pinto, Francisco; Govindan, Velu; Crespo-Herrera, Leonardo; Huerta-Espino, Julio; Singh, Ravi P; Borgwardt, Karsten; Poland, Jesse
Format: Online Article Text
Language: English
Published: Oxford University Press 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10246581/
https://www.ncbi.nlm.nih.gov/pubmed/37220903
http://dx.doi.org/10.1093/bioinformatics/btad336
author Togninalli, Matteo
Wang, Xu
Kucera, Tim
Shrestha, Sandesh
Juliana, Philomin
Mondal, Suchismita
Pinto, Francisco
Govindan, Velu
Crespo-Herrera, Leonardo
Huerta-Espino, Julio
Singh, Ravi P
Borgwardt, Karsten
Poland, Jesse
author_sort Togninalli, Matteo
collection PubMed
description MOTIVATION: Developing new crop varieties with superior performance is highly important to ensure robust and sustainable global food security. The speed of variety development is limited by long field cycles and advanced generation selections in plant breeding programs. While methods to predict yield from genotype or phenotype data have been proposed, improved performance and integrated models are needed. RESULTS: We propose a machine learning model that leverages both genotype and phenotype measurements by fusing genetic variants with multiple data sources collected by unmanned aerial systems. We use a deep multiple instance learning framework with an attention mechanism that sheds light on the importance given to each input during prediction, enhancing interpretability. Our model reaches 0.754 ± 0.024 Pearson correlation coefficient when predicting yield in similar environmental conditions; a 34.8% improvement over the genotype-only linear baseline (0.559 ± 0.050). We further predict yield on new lines in an unseen environment using only genotypes, obtaining a prediction accuracy of 0.386 ± 0.010, a 13.5% improvement over the linear baseline. Our multi-modal deep learning architecture efficiently accounts for plant health and environment, distilling the genetic contribution and providing excellent predictions. Yield prediction algorithms leveraging phenotypic observations during training therefore promise to improve breeding programs, ultimately speeding up delivery of improved varieties. AVAILABILITY AND IMPLEMENTATION: Available at https://github.com/BorgwardtLab/PheGeMIL (code) and https://doi.org/doi:10.5061/dryad.kprr4xh5p (data).
format Online
Article
Text
id pubmed-10246581
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-10246581 2023-06-08 Bioinformatics Original Paper Oxford University Press 2023-05-23 /pmc/articles/PMC10246581/ /pubmed/37220903 http://dx.doi.org/10.1093/bioinformatics/btad336 Text en © The Author(s) 2023. Published by Oxford University Press. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title Multi-modal deep learning improves grain yield prediction in wheat breeding by fusing genomics and phenomics
topic Original Paper
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10246581/
https://www.ncbi.nlm.nih.gov/pubmed/37220903
http://dx.doi.org/10.1093/bioinformatics/btad336
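Illustrative note on the abstract: the description field mentions a deep multiple instance learning framework with an attention mechanism, evaluated by Pearson correlation. The following is a minimal sketch of attention-weighted pooling over a bag of instance embeddings, not the authors' PheGeMIL implementation. The function names (`attention_pool`, `pearson_r`), the tanh scoring form, the array shapes, and the random inputs are all assumptions made for illustration only.

```python
import numpy as np


def attention_pool(instances, V, w):
    """Pool a bag of instance embeddings with learned attention weights.

    instances: (n_instances, d) array, one row per input (hypothetical:
               e.g. one embedding per data source or time point).
    V:         (d, h) projection matrix (assumed, randomly set below).
    w:         (h,) scoring vector (assumed, randomly set below).
    Returns the pooled (d,) bag embedding and the (n_instances,) weights.
    """
    scores = np.tanh(instances @ V) @ w      # one scalar score per instance
    scores = scores - scores.max()           # shift for numerical stability
    weights = np.exp(scores) / np.exp(scores).sum()  # softmax over instances
    bag = weights @ instances                # weighted average of instances
    return bag, weights


def pearson_r(y_true, y_pred):
    """Pearson correlation coefficient between observed and predicted values."""
    return float(np.corrcoef(y_true, y_pred)[0, 1])


# Toy data only; nothing here reproduces the paper's inputs or results.
rng = np.random.default_rng(0)
instances = rng.normal(size=(5, 8))
V = rng.normal(size=(8, 4))
w = rng.normal(size=4)

bag, weights = attention_pool(instances, V, w)
print("attention weights:", weights.round(3))
print("pooled embedding shape:", bag.shape)
```

The weights sum to one, so they can be read as the relative importance assigned to each input in a bag, which is the kind of interpretability the abstract refers to. In a trained model `V` and `w` would be learned jointly with the yield predictor rather than drawn at random.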