Neonatal mortality prediction with routinely collected data: a machine learning approach
BACKGROUND: Recent decreases in neonatal mortality have been slower than expected for most countries. This study aims to predict the risk of neonatal mortality using only data routinely available from birth records in the largest city of the Americas. METHODS: A probabilistic linkage of every birth...
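Main Authors: | Batista, André F. M.; Diniz, Carmen S. G.; Bonilha, Eliana A.; Kawachi, Ichiro; Chiavegatto Filho, Alexandre D. P. |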
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2021 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8293479/ https://www.ncbi.nlm.nih.gov/pubmed/34289819 http://dx.doi.org/10.1186/s12887-021-02788-9 |
_version_ | 1783725047379132416 |
---|---|
author | Batista, André F. M.; Diniz, Carmen S. G.; Bonilha, Eliana A.; Kawachi, Ichiro; Chiavegatto Filho, Alexandre D. P. |
author_facet | Batista, André F. M.; Diniz, Carmen S. G.; Bonilha, Eliana A.; Kawachi, Ichiro; Chiavegatto Filho, Alexandre D. P. |
author_sort | Batista, André F. M. |
collection | PubMed |
description | BACKGROUND: Recent decreases in neonatal mortality have been slower than expected for most countries. This study aims to predict the risk of neonatal mortality using only data routinely available from birth records in the largest city of the Americas. METHODS: A probabilistic linkage of every birth record occurring in the municipality of São Paulo, Brazil, between 2012 and 2017 was performed with the death records from 2012 to 2018 (1,202,843 births and 447,687 deaths), and a total of 7282 neonatal deaths were identified (a neonatal mortality rate of 6.46 per 1000 live births). Births from 2012 to 2016 (N = 941,308; or 83.44% of the total) were used to train five different machine learning algorithms, while births occurring in 2017 (N = 186,854; or 16.56% of the total) were used to test their predictive performance on new unseen data. RESULTS: The best performance was obtained by the extreme gradient boosting trees (XGBoost) algorithm, with a very high AUC of 0.97 and F1-score of 0.55. The 5% of births with the highest predicted risk of neonatal death included more than 90% of the actual neonatal deaths. On the other hand, there were no deaths among the 5% of births with the lowest predicted risk. There were no significant differences in predictive performance for vulnerable subgroups. The use of a smaller number of variables (WHO’s five minimum perinatal indicators) decreased overall performance, but the results still remained high (AUC of 0.91). With the addition of only three more variables, we achieved the same predictive performance (AUC of 0.97) as using all 23 variables originally available from the Brazilian birth records. CONCLUSION: Machine learning algorithms were able to identify with very high predictive performance the neonatal mortality risk of newborns using only routinely collected data. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12887-021-02788-9. |
format | Online Article Text |
id | pubmed-8293479 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-8293479 2021-07-21 Neonatal mortality prediction with routinely collected data: a machine learning approach Batista, André F. M. Diniz, Carmen S. G. Bonilha, Eliana A. Kawachi, Ichiro Chiavegatto Filho, Alexandre D. P. BMC Pediatr Research BACKGROUND: Recent decreases in neonatal mortality have been slower than expected for most countries. This study aims to predict the risk of neonatal mortality using only data routinely available from birth records in the largest city of the Americas. METHODS: A probabilistic linkage of every birth record occurring in the municipality of São Paulo, Brazil, between 2012 and 2017 was performed with the death records from 2012 to 2018 (1,202,843 births and 447,687 deaths), and a total of 7282 neonatal deaths were identified (a neonatal mortality rate of 6.46 per 1000 live births). Births from 2012 to 2016 (N = 941,308; or 83.44% of the total) were used to train five different machine learning algorithms, while births occurring in 2017 (N = 186,854; or 16.56% of the total) were used to test their predictive performance on new unseen data. RESULTS: The best performance was obtained by the extreme gradient boosting trees (XGBoost) algorithm, with a very high AUC of 0.97 and F1-score of 0.55. The 5% of births with the highest predicted risk of neonatal death included more than 90% of the actual neonatal deaths. On the other hand, there were no deaths among the 5% of births with the lowest predicted risk. There were no significant differences in predictive performance for vulnerable subgroups. The use of a smaller number of variables (WHO’s five minimum perinatal indicators) decreased overall performance, but the results still remained high (AUC of 0.91). With the addition of only three more variables, we achieved the same predictive performance (AUC of 0.97) as using all 23 variables originally available from the Brazilian birth records. CONCLUSION: Machine learning algorithms were able to identify with very high predictive performance the neonatal mortality risk of newborns using only routinely collected data. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12887-021-02788-9. BioMed Central 2021-07-21 /pmc/articles/PMC8293479/ /pubmed/34289819 http://dx.doi.org/10.1186/s12887-021-02788-9 Text en © The Author(s) 2021 https://creativecommons.org/licenses/by/4.0/ Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data. |
spellingShingle | Research Batista, André F. M. Diniz, Carmen S. G. Bonilha, Eliana A. Kawachi, Ichiro Chiavegatto Filho, Alexandre D. P. Neonatal mortality prediction with routinely collected data: a machine learning approach |
title | Neonatal mortality prediction with routinely collected data: a machine learning approach |
title_full | Neonatal mortality prediction with routinely collected data: a machine learning approach |
title_fullStr | Neonatal mortality prediction with routinely collected data: a machine learning approach |
title_full_unstemmed | Neonatal mortality prediction with routinely collected data: a machine learning approach |
title_short | Neonatal mortality prediction with routinely collected data: a machine learning approach |
title_sort | neonatal mortality prediction with routinely collected data: a machine learning approach |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8293479/ https://www.ncbi.nlm.nih.gov/pubmed/34289819 http://dx.doi.org/10.1186/s12887-021-02788-9 |
work_keys_str_mv | AT batistaandrefm neonatalmortalitypredictionwithroutinelycollecteddataamachinelearningapproach AT dinizcarmensg neonatalmortalitypredictionwithroutinelycollecteddataamachinelearningapproach AT bonilhaelianaa neonatalmortalitypredictionwithroutinelycollecteddataamachinelearningapproach AT kawachiichiro neonatalmortalitypredictionwithroutinelycollecteddataamachinelearningapproach AT chiavegattofilhoalexandredp neonatalmortalitypredictionwithroutinelycollecteddataamachinelearningapproach |
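
The abstract in this record reports a train/test split by birth year and the share of neonatal deaths found among the 5% of births with the highest predicted risk. The sketch below illustrates how such figures could be computed. It is not the authors' pipeline: the function names `split_share` and `capture_in_top_fraction` are hypothetical, and the risk scores and outcomes in the usage example are made-up toy values, not study data. Only the two birth counts come from the abstract.

```python
from typing import Sequence


def split_share(part: int, total: int) -> float:
    """Percentage of `total` represented by `part`, rounded to two decimals."""
    return round(100 * part / total, 2)


def capture_in_top_fraction(
    risk: Sequence[float], died: Sequence[int], fraction: float = 0.05
) -> float:
    """Share of all deaths that fall within the highest-risk `fraction` of births.

    `risk` holds predicted probabilities, `died` holds 0/1 outcomes in the same order.
    """
    if len(risk) != len(died):
        raise ValueError("risk and died must have the same length")
    # Number of births in the top group (at least one).
    n_top = max(1, round(len(risk) * fraction))
    # Indices of births ordered from highest to lowest predicted risk.
    order = sorted(range(len(risk)), key=lambda i: risk[i], reverse=True)
    deaths_in_top = sum(died[i] for i in order[:n_top])
    total_deaths = sum(died)
    return deaths_in_top / total_deaths if total_deaths else 0.0


# Birth counts quoted in the abstract (training years vs. 2017 test year).
train_births, test_births = 941_308, 186_854
linked_births = train_births + test_births
print(split_share(train_births, linked_births))  # 83.44
print(split_share(test_births, linked_births))   # 16.56

# Toy example (invented values): 10 births, 3 deaths, top 20% by predicted risk.
toy_risk = [0.9, 0.1, 0.8, 0.2, 0.05, 0.7, 0.3, 0.4, 0.6, 0.5]
toy_died = [1, 0, 1, 0, 0, 0, 0, 0, 1, 0]
print(capture_in_top_fraction(toy_risk, toy_died, fraction=0.2))
```

In the toy example the two highest-risk births both died, so two of the three deaths fall in the top group. Applied to real model output, the same calculation with `fraction=0.05` would correspond to the "more than 90%" figure reported in the abstract.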