Machine Learning Algorithms for understanding the determinants of under-five Mortality

BACKGROUND: Under-five mortality is a matter of serious concern for child health as well as the social development of any country. The paper aimed to find the accuracy of machine learning models in predicting under-five mortality and identify the most significant factors associated with under-five m...

Bibliographic Details
Main Authors: Saroj, Rakesh Kumar, Yadav, Pawan Kumar, Singh, Rajneesh, Chilyabanyama, Obvious.N.
Format: Online Article Text
Language: English
Published: BioMed Central 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9509654/
https://www.ncbi.nlm.nih.gov/pubmed/36153553
http://dx.doi.org/10.1186/s13040-022-00308-8
_version_ 1784797275773468672
author Saroj, Rakesh Kumar
Yadav, Pawan Kumar
Singh, Rajneesh
Chilyabanyama, Obvious.N.
author_facet Saroj, Rakesh Kumar
Yadav, Pawan Kumar
Singh, Rajneesh
Chilyabanyama, Obvious.N.
author_sort Saroj, Rakesh Kumar
collection PubMed
description BACKGROUND: Under-five mortality is a matter of serious concern for child health as well as the social development of any country. The paper aimed to find the accuracy of machine learning models in predicting under-five mortality and identify the most significant factors associated with under-five mortality. METHOD: The data were taken from the National Family Health Survey (NFHS-IV) of Uttar Pradesh. First, we used multivariate logistic regression for its ability to identify the important factors, then we used machine learning techniques such as decision tree, random forest, Naïve Bayes, K-nearest neighbor (KNN), logistic regression, support vector machine (SVM), neural network, and ridge classifier. Each model's performance was assessed using a confusion matrix, accuracy, precision, recall, F1 score, Cohen's Kappa, and area under the receiver operating characteristic curve (AUROC). Information gain rank was used to find the important factors for under-five mortality. Data analysis was performed using STATA-16.0, Python 3.3, and IBM SPSS Statistics for Windows, Version 27.0. RESULT: The neural network model was the best predictive model for under-five mortality when compared with the other predictive models, with accuracy of 95.29% to 95.96%, recall of 71.51% to 81.03%, precision of 36.64% to 51.83%, F1 score of 50.46% to 62.68%, Cohen's Kappa of 0.48 to 0.60, AUROC of 93.51% to 96.22%, and precision-recall curve of 99.52% to 99.73%. The neural network was the most efficient model, but logistic regression also performed well in predicting under-five mortality, with accuracy of 94% to 95%, AUROC of 93.4% to 94.8%, and precision-recall curve of 99.5% to 99.6%. The number of living children, survival time, wealth index, child size at birth, birth in the last five years, the total number of children ever born, mother's education level, and birth order were identified as important factors influencing under-five mortality. CONCLUSION: The neural network model was a better predictive model than the other machine learning models in predicting under-five mortality, but logistic regression analysis also showed good results. These models may be helpful for the analysis of high-dimensional data for health research.
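The description names several evaluation metrics (accuracy, precision, recall, F1 score, Cohen's Kappa) and an information gain ranking, but the record does not include the authors' code. The following is a minimal sketch of how such quantities are commonly computed, not the authors' implementation. The function names, the toy labels, and the choice of Shannon entropy over a categorical feature for information gain are all assumptions made for illustration.

```python
import math
from collections import Counter


def binary_metrics(y_true, y_pred):
    """Confusion-matrix metrics for binary labels (1 = positive class).

    Hypothetical helper for illustration; not taken from the paper.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    n = tp + tn + fp + fn

    accuracy = (tp + tn) / n
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)

    # Cohen's kappa: observed agreement corrected for the agreement
    # expected by chance from the row and column totals.
    p_e = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n ** 2
    kappa = (accuracy - p_e) / (1 - p_e) if p_e != 1 else 0.0

    return {
        "confusion": {"tp": tp, "tn": tn, "fp": fp, "fn": fn},
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "kappa": kappa,
    }


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(feature, labels):
    """Reduction in label entropy after splitting on a categorical feature."""
    n = len(labels)
    groups = {}
    for value, label in zip(feature, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


# Toy data (invented): 1 = death before age five, 0 = survived.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
print(binary_metrics(y_true, y_pred))

# Ranking features by information gain: higher means more informative.
wealth = ["low", "low", "low", "high", "high", "low", "high", "high", "high", "high"]
print(information_gain(wealth, y_true))
```

With a rare outcome such as under-five mortality, accuracy can be high even when precision is modest, which is one reason to report recall, F1, and Cohen's Kappa alongside it.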
format Online
Article
Text
id pubmed-9509654
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-9509654 2022-09-26 Machine Learning Algorithms for understanding the determinants of under-five Mortality Saroj, Rakesh Kumar Yadav, Pawan Kumar Singh, Rajneesh Chilyabanyama, Obvious.N. BioData Min Methodology BioMed Central 2022-09-24 /pmc/articles/PMC9509654/ /pubmed/36153553 http://dx.doi.org/10.1186/s13040-022-00308-8 Text en © The Author(s) 2022 https://creativecommons.org/licenses/by/4.0/ Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
spellingShingle Methodology
Saroj, Rakesh Kumar
Yadav, Pawan Kumar
Singh, Rajneesh
Chilyabanyama, Obvious.N.
Machine Learning Algorithms for understanding the determinants of under-five Mortality
title Machine Learning Algorithms for understanding the determinants of under-five Mortality
title_full Machine Learning Algorithms for understanding the determinants of under-five Mortality
title_fullStr Machine Learning Algorithms for understanding the determinants of under-five Mortality
title_full_unstemmed Machine Learning Algorithms for understanding the determinants of under-five Mortality
title_short Machine Learning Algorithms for understanding the determinants of under-five Mortality
title_sort machine learning algorithms for understanding the determinants of under-five mortality
topic Methodology
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9509654/
https://www.ncbi.nlm.nih.gov/pubmed/36153553
http://dx.doi.org/10.1186/s13040-022-00308-8
work_keys_str_mv AT sarojrakeshkumar machinelearningalgorithmsforunderstandingthedeterminantsofunderfivemortality
AT yadavpawankumar machinelearningalgorithmsforunderstandingthedeterminantsofunderfivemortality
AT singhrajneesh machinelearningalgorithmsforunderstandingthedeterminantsofunderfivemortality
AT chilyabanyamaobviousn machinelearningalgorithmsforunderstandingthedeterminantsofunderfivemortality