Comparison of machine learning and logistic regression as predictive models for adverse maternal and neonatal outcomes of preeclampsia: A retrospective study

Bibliographic Details
Main authors: Zheng, Dongying; Hao, Xinyu; Khan, Muhanmmad; Wang, Lixia; Li, Fan; Xiang, Ning; Kang, Fuli; Hamalainen, Timo; Cong, Fengyu; Song, Kedong; Qiao, Chong
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2022
Subjects: Cardiovascular Medicine
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9596815/
https://www.ncbi.nlm.nih.gov/pubmed/36312231
http://dx.doi.org/10.3389/fcvm.2022.959649
author Zheng, Dongying
Hao, Xinyu
Khan, Muhanmmad
Wang, Lixia
Li, Fan
Xiang, Ning
Kang, Fuli
Hamalainen, Timo
Cong, Fengyu
Song, Kedong
Qiao, Chong
collection PubMed
description INTRODUCTION: Preeclampsia, one of the leading causes of maternal and fetal morbidity and mortality, demands accurate predictive models because effective treatment is lacking. Predictive models based on machine learning algorithms show promising potential, but it remains controversial whether machine learning methods should be preferred over traditional statistical models.
METHODS: We employed logistic regression and six machine learning methods as binary predictive models on a dataset of 733 women diagnosed with preeclampsia. Participants were grouped by four different pregnancy outcomes. After imputation of missing values, statistical description and comparison were conducted as a preliminary step to explore the characteristics of the 73 documented variables. Subsequently, correlation analysis and feature selection were performed as preprocessing steps to filter contributing variables for model development. The models were evaluated by multiple criteria.
RESULTS: First, the influential variables screened by the preprocessing steps did not overlap with those determined by statistical differences. Second, the most accurate imputation method was K-Nearest Neighbor, and the imputation process had little effect on the performance of the developed models. Finally, the performance of the models was investigated. The random forest classifier, multi-layer perceptron, and support vector machine demonstrated better discriminative power, as evaluated by the area under the receiver operating characteristic curve, while the decision tree classifier, random forest, and logistic regression yielded better calibration, as verified by the calibration curve.
CONCLUSION: Machine learning algorithms can accomplish prediction modeling and demonstrate superior discrimination, while logistic regression can be calibrated well. Statistical analysis and machine learning are two scientific domains sharing similar themes. The predictive abilities of such models vary according to the characteristics of the datasets, and larger sample sizes and more influential predictors are still needed to accumulate evidence.
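The abstract separates two evaluation criteria: discrimination (area under the ROC curve) and calibration (agreement between predicted probabilities and observed event rates). The following is a minimal sketch of that distinction using only the Python standard library. It is not the authors' code; the function names (roc_auc, brier_score, calibration_bins) and the toy labels and probabilities are hypothetical and chosen only for illustration. The Brier score is included as a simple numeric summary of calibration-related error and is an assumption of this sketch; the abstract itself mentions only the calibration curve.

```python
from typing import List, Sequence, Tuple


def roc_auc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def calibration_bins(
    y_true: Sequence[int], y_prob: Sequence[float], n_bins: int = 5
) -> List[Tuple[float, float, int]]:
    """Points of a calibration curve over equal-width probability bins.

    Returns (mean predicted probability, observed event rate, count)
    for every non-empty bin. Probabilities are assumed to lie in [0, 1].
    """
    bins: List[List[Tuple[float, int]]] = [[] for _ in range(n_bins)]
    for y, p in zip(y_true, y_prob):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        bins[idx].append((p, y))
    points = []
    for b in bins:
        if b:
            mean_p = sum(p for p, _ in b) / len(b)
            rate = sum(y for _, y in b) / len(b)
            points.append((mean_p, rate, len(b)))
    return points


# Hypothetical toy data: outcomes and predicted probabilities from "model A".
y = [0, 0, 0, 1, 0, 1, 1, 1]
prob_a = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
# "Model B" halves every probability: the ranking is unchanged, the scale is not.
prob_b = [p * 0.5 for p in prob_a]

print("AUC   A/B:", roc_auc(y, prob_a), roc_auc(y, prob_b))
print("Brier A/B:", brier_score(y, prob_a), brier_score(y, prob_b))
print("Calibration points A:", calibration_bins(y, prob_a))
```

Because AUC depends only on how cases are ranked, the two hypothetical models have identical discrimination, while the rescaled probabilities of model B are further from the observed outcomes. This is why a model can lead on one criterion and trail on the other, as the abstract reports for different classifiers.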
format Online
Article
Text
id pubmed-9596815
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-9596815 2022-10-27 Front Cardiovasc Med Cardiovascular Medicine Frontiers Media S.A. 2022-10-12 /pmc/articles/PMC9596815/ /pubmed/36312231 http://dx.doi.org/10.3389/fcvm.2022.959649 Text en
Copyright © 2022 Zheng, Hao, Khan, Wang, Li, Xiang, Kang, Hamalainen, Cong, Song and Qiao. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title Comparison of machine learning and logistic regression as predictive models for adverse maternal and neonatal outcomes of preeclampsia: A retrospective study
topic Cardiovascular Medicine
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9596815/
https://www.ncbi.nlm.nih.gov/pubmed/36312231
http://dx.doi.org/10.3389/fcvm.2022.959649