A Comparative Analysis of Novel Deep Learning and Ensemble Learning Models to Predict the Allergenicity of Food Proteins

Traditional food allergen identification mainly relies on in vivo and in vitro experiments, which often require long periods and high costs. The artificial intelligence (AI)-driven rapid food allergen identification method has addressed some of the above-mentioned drawbacks and is becoming an efficient auxi...

Bibliographic Details
Main authors: Wang, Liyang; Niu, Dantong; Zhao, Xinjie; Wang, Xiaoya; Hao, Mengzhen; Che, Huilian
Format: Online Article Text
Language: English
Published: MDPI 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8069377/
https://www.ncbi.nlm.nih.gov/pubmed/33918556
http://dx.doi.org/10.3390/foods10040809
author Wang, Liyang
Niu, Dantong
Zhao, Xinjie
Wang, Xiaoya
Hao, Mengzhen
Che, Huilian
collection PubMed
description Traditional food allergen identification mainly relies on in vivo and in vitro experiments, which often require long periods and high costs. The artificial intelligence (AI)-driven rapid food allergen identification method has addressed some of the above-mentioned drawbacks and is becoming an efficient auxiliary tool. To overcome the lower accuracy of traditional machine learning models in predicting the allergenicity of food proteins, this work introduces a deep learning model (a transformer with a self-attention mechanism) and ensemble learning models (represented by Light Gradient Boosting Machine (LightGBM) and eXtreme Gradient Boosting (XGBoost)). To highlight the advantages of the proposed method, the study also selected various commonly used machine learning models as baseline classifiers. The results of 5-fold cross-validation showed that the area under the receiver operating characteristic curve (AUC) of the deep model was the highest (0.9578), better than that of the ensemble learning and baseline algorithms. However, the deep model needs to be pre-trained, and its training time is the longest. A comparison of the characteristics of the transformer model and the boosting models shows that each model has its own advantages, which provides novel clues and inspiration for the rapid prediction of food allergens in the future.
format Online
Article
Text
id pubmed-8069377
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-8069377 2021-04-26 Foods Article MDPI 2021-04-09 /pmc/articles/PMC8069377/ /pubmed/33918556 http://dx.doi.org/10.3390/foods10040809 Text en
© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title A Comparative Analysis of Novel Deep Learning and Ensemble Learning Models to Predict the Allergenicity of Food Proteins
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8069377/
https://www.ncbi.nlm.nih.gov/pubmed/33918556
http://dx.doi.org/10.3390/foods10040809