Deep polygenic neural network for predicting and identifying yield-associated genes in Indonesian rice accessions
As the fourth most populous country in the world, Indonesia must increase the annual rice production rate to achieve national food security by 2050. One possible solution comes from the nanoscopic level: a genetic variant called Single Nucleotide Polymorphism (SNP), which can express significant yie...
Main authors: | Dominic, Nicholas; Cenggoro, Tjeng Wawan; Budiarto, Arif; Pardamean, Bens |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Nature Publishing Group UK 2022 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9378700/ https://www.ncbi.nlm.nih.gov/pubmed/35970979 http://dx.doi.org/10.1038/s41598-022-16075-9 |
_version_ | 1784768572646490112 |
---|---|
author | Dominic, Nicholas Cenggoro, Tjeng Wawan Budiarto, Arif Pardamean, Bens |
author_facet | Dominic, Nicholas Cenggoro, Tjeng Wawan Budiarto, Arif Pardamean, Bens |
author_sort | Dominic, Nicholas |
collection | PubMed |
description | As the fourth most populous country in the world, Indonesia must increase its annual rice production rate to achieve national food security by 2050. One possible solution comes from the nanoscopic level: a genetic variant called Single Nucleotide Polymorphism (SNP), which can express significant yield-associated genes. The prior benchmark for this study used a statistical genetics model that involved neither SNP position information nor an attention mechanism. Hence, we developed a novel deep polygenic neural network, named the NucleoNet model, to address these obstacles. The NucleoNets combine several prominent components: positional SNP encoding, the context vector, wide models, Elastic Net, and Shannon’s entropy loss. At best, this polygenic modeling reached a Mean Squared Error (MSE) of 2.779 and a Symmetric Mean Absolute Percentage Error (SMAPE) of 47.156%, while revealing 15 new important SNPs. Furthermore, the NucleoNets reduced the MSE by up to 32.28% compared to the Ordinary Least Squares (OLS) model. Through the ablation study, we learned that combining Xavier initialization for the weights with Normal-distribution initialization for the biases surfaced a more diverse set of important SNPs across the 12 chromosomes. Our findings confirmed that the NucleoNet model outperformed the OLS model and identified SNPs important to Indonesian rice yields. |
format | Online Article Text |
id | pubmed-9378700 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Nature Publishing Group UK |
record_format | MEDLINE/PubMed |
spelling | pubmed-9378700 2022-08-17 Deep polygenic neural network for predicting and identifying yield-associated genes in Indonesian rice accessions Dominic, Nicholas Cenggoro, Tjeng Wawan Budiarto, Arif Pardamean, Bens Sci Rep Article As the fourth most populous country in the world, Indonesia must increase its annual rice production rate to achieve national food security by 2050. One possible solution comes from the nanoscopic level: a genetic variant called Single Nucleotide Polymorphism (SNP), which can express significant yield-associated genes. The prior benchmark for this study used a statistical genetics model that involved neither SNP position information nor an attention mechanism. Hence, we developed a novel deep polygenic neural network, named the NucleoNet model, to address these obstacles. The NucleoNets combine several prominent components: positional SNP encoding, the context vector, wide models, Elastic Net, and Shannon’s entropy loss. At best, this polygenic modeling reached a Mean Squared Error (MSE) of 2.779 and a Symmetric Mean Absolute Percentage Error (SMAPE) of 47.156%, while revealing 15 new important SNPs. Furthermore, the NucleoNets reduced the MSE by up to 32.28% compared to the Ordinary Least Squares (OLS) model. Through the ablation study, we learned that combining Xavier initialization for the weights with Normal-distribution initialization for the biases surfaced a more diverse set of important SNPs across the 12 chromosomes. Our findings confirmed that the NucleoNet model outperformed the OLS model and identified SNPs important to Indonesian rice yields.
Nature Publishing Group UK 2022-08-15 /pmc/articles/PMC9378700/ /pubmed/35970979 http://dx.doi.org/10.1038/s41598-022-16075-9 Text en © The Author(s) 2022 https://creativecommons.org/licenses/by/4.0/ Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. |
spellingShingle | Article Dominic, Nicholas Cenggoro, Tjeng Wawan Budiarto, Arif Pardamean, Bens Deep polygenic neural network for predicting and identifying yield-associated genes in Indonesian rice accessions |
title | Deep polygenic neural network for predicting and identifying yield-associated genes in Indonesian rice accessions |
title_full | Deep polygenic neural network for predicting and identifying yield-associated genes in Indonesian rice accessions |
title_fullStr | Deep polygenic neural network for predicting and identifying yield-associated genes in Indonesian rice accessions |
title_full_unstemmed | Deep polygenic neural network for predicting and identifying yield-associated genes in Indonesian rice accessions |
title_short | Deep polygenic neural network for predicting and identifying yield-associated genes in Indonesian rice accessions |
title_sort | deep polygenic neural network for predicting and identifying yield-associated genes in indonesian rice accessions |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9378700/ https://www.ncbi.nlm.nih.gov/pubmed/35970979 http://dx.doi.org/10.1038/s41598-022-16075-9 |
work_keys_str_mv | AT dominicnicholas deeppolygenicneuralnetworkforpredictingandidentifyingyieldassociatedgenesinindonesianriceaccessions AT cenggorotjengwawan deeppolygenicneuralnetworkforpredictingandidentifyingyieldassociatedgenesinindonesianriceaccessions AT budiartoarif deeppolygenicneuralnetworkforpredictingandidentifyingyieldassociatedgenesinindonesianriceaccessions AT pardameanbens deeppolygenicneuralnetworkforpredictingandidentifyingyieldassociatedgenesinindonesianriceaccessions |