Comparison of machine learning and logistic regression as predictive models for adverse maternal and neonatal outcomes of preeclampsia: A retrospective study
INTRODUCTION: Preeclampsia, one of the leading causes of maternal and fetal morbidity and mortality, demands accurate predictive models because effective treatment is lacking. Predictive models based on machine learning algorithms demonstrate promising potential, while there is a controversial discussi...
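Main authors: | Zheng, Dongying; Hao, Xinyu; Khan, Muhanmmad; Wang, Lixia; Li, Fan; Xiang, Ning; Kang, Fuli; Hamalainen, Timo; Cong, Fengyu; Song, Kedong; Qiao, Chong |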
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Frontiers Media S.A. 2022 |
Subjects: | Cardiovascular Medicine |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9596815/ https://www.ncbi.nlm.nih.gov/pubmed/36312231 http://dx.doi.org/10.3389/fcvm.2022.959649 |
_version_ | 1784815950372012032 |
---|---|
author | Zheng, Dongying Hao, Xinyu Khan, Muhanmmad Wang, Lixia Li, Fan Xiang, Ning Kang, Fuli Hamalainen, Timo Cong, Fengyu Song, Kedong Qiao, Chong |
author_sort | Zheng, Dongying |
collection | PubMed |
description | INTRODUCTION: Preeclampsia, one of the leading causes of maternal and fetal morbidity and mortality, demands accurate predictive models because effective treatment is lacking. Predictive models based on machine learning algorithms show promise, but whether machine learning methods should be preferred over traditional statistical models remains controversial. METHODS: We applied logistic regression and six machine learning methods as binary predictive models to a dataset of 733 women diagnosed with preeclampsia. Participants were grouped by four different pregnancy outcomes. After imputation of missing values, statistical description and comparison were conducted as a preliminary step to explore the characteristics of the 73 documented variables. Subsequently, correlation analysis and feature selection were performed as preprocessing steps to filter the variables contributing to model development. The models were evaluated by multiple criteria. RESULTS: First, the influential variables screened by the preprocessing steps did not overlap with those determined by statistical differences. Second, the most accurate imputation method was K-Nearest Neighbor, and the imputation process had little effect on the performance of the developed models. Finally, the performance of the models was investigated. The random forest classifier, multi-layer perceptron, and support vector machine demonstrated better discriminative power, as evaluated by the area under the receiver operating characteristic curve, while the decision tree classifier, random forest, and logistic regression yielded better calibration, as verified by the calibration curve. CONCLUSION: Machine learning algorithms can accomplish prediction modeling and demonstrate superior discrimination, while logistic regression can be calibrated well. Statistical analysis and machine learning are two scientific domains sharing similar themes. The predictive abilities of the developed models vary with the characteristics of the dataset; larger sample sizes and more influential predictors are still needed to accumulate evidence. |
format | Online Article Text |
id | pubmed-9596815 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-9596815 2022-10-27 Front Cardiovasc Med Cardiovascular Medicine Frontiers Media S.A. 2022-10-12 /pmc/articles/PMC9596815/ /pubmed/36312231 http://dx.doi.org/10.3389/fcvm.2022.959649 Text en Copyright © 2022 Zheng, Hao, Khan, Wang, Li, Xiang, Kang, Hamalainen, Cong, Song and Qiao. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
title | Comparison of machine learning and logistic regression as predictive models for adverse maternal and neonatal outcomes of preeclampsia: A retrospective study |
topic | Cardiovascular Medicine |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9596815/ https://www.ncbi.nlm.nih.gov/pubmed/36312231 http://dx.doi.org/10.3389/fcvm.2022.959649 |
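The description field in this record contrasts two evaluation criteria: discrimination, measured by the area under the receiver operating characteristic curve (AUC), and calibration, checked with a calibration curve. The following is a minimal Python sketch of how the two can diverge. It is not the authors' code, and it uses invented toy labels and probabilities rather than study data. The function names (`roc_auc`, `calibration_bins`, `expected_calibration_error`) and the two-bin setting are assumptions made for illustration only.

```python
from typing import List, Sequence, Tuple


def roc_auc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def calibration_bins(
    y_true: Sequence[int], y_prob: Sequence[float], n_bins: int = 10
) -> List[Tuple[float, float, int]]:
    """Per non-empty equal-width bin: (mean predicted probability, observed event rate, count)."""
    sums = [0.0] * n_bins
    events = [0] * n_bins
    counts = [0] * n_bins
    for y, p in zip(y_true, y_prob):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 falls in the last bin
        sums[idx] += p
        events[idx] += y
        counts[idx] += 1
    return [
        (sums[i] / counts[i], events[i] / counts[i], counts[i])
        for i in range(n_bins)
        if counts[i] > 0
    ]


def expected_calibration_error(
    y_true: Sequence[int], y_prob: Sequence[float], n_bins: int = 10
) -> float:
    """Count-weighted mean gap between predicted probability and observed rate."""
    bins = calibration_bins(y_true, y_prob, n_bins)
    total = sum(n for _, _, n in bins)
    return sum(n * abs(mean_p - obs) for mean_p, obs, n in bins) / total


# Toy data (invented): both models rank the cases identically,
# but model B shifts every probability upward.
y_true = [0, 0, 0, 1, 0, 1, 1, 1]
probs_a = [0.10, 0.20, 0.30, 0.40, 0.60, 0.70, 0.80, 0.90]
probs_b = [0.55, 0.60, 0.65, 0.70, 0.80, 0.85, 0.90, 0.95]

for name, probs in (("A", probs_a), ("B", probs_b)):
    auc = roc_auc(y_true, probs)
    ece = expected_calibration_error(y_true, probs, n_bins=2)
    print(f"model {name}: AUC={auc:.4f}, calibration error={ece:.4f}")
```

Because AUC depends only on the ranking of the scores, the two toy models obtain the same AUC, while the shifted probabilities of model B produce a larger calibration gap. This is one way a model can discriminate well yet be poorly calibrated, which is why the abstract reports both criteria.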