Machine learning predictive modelling for identification of predictors of acute respiratory infection and diarrhoea in Uganda’s rural and urban settings
Despite widely known preventive interventions, the dyad of acute respiratory infections (ARI) and diarrhoea remains among the top global causes of mortality in children under 5 years. Studies on child morbidity have largely applied “traditional” statistical techniques that have limitations in handling...
Main author: | Kananura, Rornald Muhumuza |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science 2022 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10021828/ https://www.ncbi.nlm.nih.gov/pubmed/36962243 http://dx.doi.org/10.1371/journal.pgph.0000430 |
_version_ | 1784908590053588992 |
---|---|
author | Kananura, Rornald Muhumuza |
collection | PubMed |
description | Despite widely known preventive interventions, the dyad of acute respiratory infections (ARI) and diarrhoea remains among the top global causes of mortality in children under 5 years. Studies on child morbidity have largely applied “traditional” statistical techniques that have limitations in handling high-dimensional data, which leads to the exclusion of some variables. Machine learning (ML) models appear to perform better on high-dimensional data (datasets in which the number of features p, usually correlated, is larger than the number of observations N). Using Uganda’s 2006–2016 DHS pooled data on children aged 6–59 months, I applied ML techniques to identify rural-urban differentials in the predictors of child diarrhoea and ARI. I also used ML to identify variables omitted from the current child morbidity frameworks. The predictors were grouped into four categories: child characteristics, maternal characteristics, household characteristics and immunisation. I used 90% of the datasets as training sets (data used to fit, or train, a prediction model), which were tested or validated on 10% and 30% datasets (pseudo-new data used for evaluating the performance of the model on a new dataset). The measure of prediction was based on 10-fold cross-validation (a resampling technique). The gradient-boosted machine (an ML technique) was the best-performing model for identifying the predictors of ARI (accuracy: 100% rural and 100% urban) and diarrhoea (accuracy: 70% rural and 100% urban). The identified predictors relate to the household’s structure and composition, which is characterised by poor hygiene and sanitation and poor household environments that make children more susceptible to developing these diseases; maternal socio-economic factors such as education, occupation, and fertility (birth order); individual risk factors such as child age, birth weight and nutritional status; and protective interventions (immunisation). The study findings confirm the notion that ARI and diarrhoea risk factors overlap. The results highlight the need for a holistic approach with multisectoral emphasis in addressing the occurrence of ARI and diarrhoea among children. In particular, the results provide insight into the importance of implementing interventions that are responsive to the unique structure and composition of the household. Finally, alongside traditional models, machine learning could be applied in generating research hypotheses and in guiding the selection of key variables that should be considered in the model. |
format | Online Article Text |
id | pubmed-10021828 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-10021828 2023-03-17 Kananura, Rornald Muhumuza PLOS Glob Public Health Research Article Public Library of Science 2022-05-11 /pmc/articles/PMC10021828/ /pubmed/36962243 http://dx.doi.org/10.1371/journal.pgph.0000430 Text en © 2022 Rornald Muhumuza Kananura https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. |
title | Machine learning predictive modelling for identification of predictors of acute respiratory infection and diarrhoea in Uganda’s rural and urban settings |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10021828/ https://www.ncbi.nlm.nih.gov/pubmed/36962243 http://dx.doi.org/10.1371/journal.pgph.0000430 |
work_keys_str_mv | AT kananurarornaldmuhumuza machinelearningpredictivemodellingforidentificationofpredictorsofacuterespiratoryinfectionanddiarrhoeainugandasruralandurbansettings |
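
The description above reports that prediction performance was measured with 10-fold cross-validation. A minimal sketch of that resampling step follows, in plain Python and on made-up labels rather than the DHS data. The function names are hypothetical, and the majority-class predictor is only a stand-in for illustration; it is not the gradient-boosted machine used in the study.

```python
import random


def k_fold_indices(n_rows, k=10, seed=0):
    """Shuffle row indices and deal them into k near-equal folds."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def majority_label(labels):
    """Stand-in 'model': always predict the most common training label."""
    return max(set(labels), key=labels.count)


def cross_validated_accuracy(labels, k=10, seed=0):
    """Average held-out accuracy across k folds."""
    folds = k_fold_indices(len(labels), k, seed)
    scores = []
    for held_out in folds:
        held_out_set = set(held_out)
        # Fit on every row that is NOT in the held-out fold.
        train = [labels[i] for i in range(len(labels)) if i not in held_out_set]
        prediction = majority_label(train)
        # Score on the held-out fold only.
        correct = sum(1 for i in held_out if labels[i] == prediction)
        scores.append(correct / len(held_out))
    return sum(scores) / len(scores)


# Synthetic outcome (assumption): 1 = child had the illness, 0 = did not.
toy_labels = [1] * 25 + [0] * 75
folds = k_fold_indices(len(toy_labels))
accuracy = cross_validated_accuracy(toy_labels)
print(f"folds: {len(folds)}, mean held-out accuracy: {accuracy:.2f}")
```

Each row is held out exactly once, so the averaged score reflects data the stand-in model never saw while fitting. A baseline like this is also a useful yardstick: with 75% of the toy labels equal to 0, always predicting 0 already scores about 0.75, so a real model's accuracy is only informative relative to the class balance.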