Prediction of Gestational Diabetes Mellitus under Cascade and Ensemble Learning Algorithm
Gestational diabetes mellitus (GDM) is a risk factor for fetal dysplasia and maternal pregnancy difficulties, so predicting GDM risk in advance is in demand for millions of families. Machine learning technology is therefore adopted to study GDM prediction...
Main authors: | Zhang, Jie; Wang, Fang |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Hindawi 2022 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9303101/ https://www.ncbi.nlm.nih.gov/pubmed/35875747 http://dx.doi.org/10.1155/2022/3212738 |
_version_ | 1784751779333799936 |
---|---|
author | Zhang, Jie Wang, Fang |
author_facet | Zhang, Jie Wang, Fang |
author_sort | Zhang, Jie |
collection | PubMed |
description | Gestational diabetes mellitus (GDM) is a risk factor for fetal dysplasia and maternal pregnancy difficulties, so predicting GDM risk in advance is in demand for millions of families. Machine learning technology is therefore adopted to study GDM prediction. First, the data are preprocessed, with the mean value used for outlier processing. The information value (IV) method is then used to screen the features: of the 83 features in the original sample data, 40 important features are retained through feature engineering. On this basis, logistic regression, Lasso-logistic regression, Gradient Boosting Decision Tree (GBDT), Extreme Gradient Boosting (XGBoost), Light Gradient Boosting Machine (LightGBM), and gradient boosting with categorical features (CatBoost) models are established, and multiple learners are integrated. Finally, the constructed model is tested on data sets. The accuracy of the proposed model is 80.3%, the precision is 74.6%, the recall rate is 79.3%, and the running time is only 2.53 seconds. The proposed model is thus superior to the previous models in terms of accuracy, precision, recall rate, and F1 value, and its time consumption is in line with practical engineering requirements. The proposed scheme provides some ideas for research on machine learning technology in disease prediction. |
format | Online Article Text |
id | pubmed-9303101 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Hindawi |
record_format | MEDLINE/PubMed |
spelling | pubmed-9303101 2022-07-22 Prediction of Gestational Diabetes Mellitus under Cascade and Ensemble Learning Algorithm Zhang, Jie Wang, Fang Comput Intell Neurosci Research Article Gestational diabetes mellitus (GDM) is a risk factor for fetal dysplasia and maternal pregnancy difficulties, so predicting GDM risk in advance is in demand for millions of families. Machine learning technology is therefore adopted to study GDM prediction. First, the data are preprocessed, with the mean value used for outlier processing. The information value (IV) method is then used to screen the features: of the 83 features in the original sample data, 40 important features are retained through feature engineering. On this basis, logistic regression, Lasso-logistic regression, Gradient Boosting Decision Tree (GBDT), Extreme Gradient Boosting (XGBoost), Light Gradient Boosting Machine (LightGBM), and gradient boosting with categorical features (CatBoost) models are established, and multiple learners are integrated. Finally, the constructed model is tested on data sets. The accuracy of the proposed model is 80.3%, the precision is 74.6%, the recall rate is 79.3%, and the running time is only 2.53 seconds. The proposed model is thus superior to the previous models in terms of accuracy, precision, recall rate, and F1 value, and its time consumption is in line with practical engineering requirements. The proposed scheme provides some ideas for research on machine learning technology in disease prediction. Hindawi 2022-07-14 /pmc/articles/PMC9303101/ /pubmed/35875747 http://dx.doi.org/10.1155/2022/3212738 Text en Copyright © 2022 Jie Zhang and Fang Wang. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Research Article Zhang, Jie Wang, Fang Prediction of Gestational Diabetes Mellitus under Cascade and Ensemble Learning Algorithm |
title | Prediction of Gestational Diabetes Mellitus under Cascade and Ensemble Learning Algorithm |
title_full | Prediction of Gestational Diabetes Mellitus under Cascade and Ensemble Learning Algorithm |
title_fullStr | Prediction of Gestational Diabetes Mellitus under Cascade and Ensemble Learning Algorithm |
title_full_unstemmed | Prediction of Gestational Diabetes Mellitus under Cascade and Ensemble Learning Algorithm |
title_short | Prediction of Gestational Diabetes Mellitus under Cascade and Ensemble Learning Algorithm |
title_sort | prediction of gestational diabetes mellitus under cascade and ensemble learning algorithm |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9303101/ https://www.ncbi.nlm.nih.gov/pubmed/35875747 http://dx.doi.org/10.1155/2022/3212738 |
work_keys_str_mv | AT zhangjie predictionofgestationaldiabetesmellitusundercascadeandensemblelearningalgorithm AT wangfang predictionofgestationaldiabetesmellitusundercascadeandensemblelearningalgorithm |
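
The description field above says that features are screened with the IV method before the models are built. The record gives no details of that step, so what follows is only a minimal sketch of how an information value (IV) screen is commonly computed for a binary outcome, assuming each feature has already been binned. The function names, the smoothing constant `eps`, and the 0.02 cutoff are illustrative assumptions and are not taken from the article.

```python
import math
from collections import Counter


def information_value(bins, labels, eps=0.5):
    """IV of one binned feature against a binary label (1 = case, 0 = control).

    IV = sum over bins of (p_b - q_b) * ln(p_b / q_b), where p_b and q_b are
    the shares of cases and controls that fall in bin b.
    eps is an additive smoothing term (an assumption of this sketch) that
    avoids log(0) when a bin contains only one class.
    """
    pos = Counter(b for b, y in zip(bins, labels) if y == 1)
    neg = Counter(b for b, y in zip(bins, labels) if y == 0)
    bin_names = set(bins)
    pos_total = sum(pos.values()) + eps * len(bin_names)
    neg_total = sum(neg.values()) + eps * len(bin_names)

    iv = 0.0
    for b in bin_names:
        p = (pos[b] + eps) / pos_total  # share of cases in this bin
        q = (neg[b] + eps) / neg_total  # share of controls in this bin
        iv += (p - q) * math.log(p / q)
    return iv


def screen_features(binned_features, labels, threshold=0.02):
    """Keep the names of features whose IV reaches the threshold.

    binned_features maps a feature name to its list of bin labels.
    The 0.02 threshold is a hypothetical cutoff, not the article's value.
    """
    scores = {name: information_value(bins, labels)
              for name, bins in binned_features.items()}
    return [name for name, iv in scores.items() if iv >= threshold], scores


# Toy data: one feature that separates the classes and one that does not.
labels = [0, 0, 0, 1, 1, 1, 1, 0]
features = {
    "informative": ["low"] * 4 + ["high"] * 4,
    "uninformative": ["a", "b"] * 4,
}
kept, scores = screen_features(features, labels)
print(kept, scores)
```

On the toy data only the feature whose bins differ between cases and controls passes the screen; a feature distributed identically in both classes has an IV of zero and is dropped.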