Comparison of Multivariable Logistic Regression and Machine Learning Models for Predicting Bronchopulmonary Dysplasia or Death in Very Preterm Infants
Bronchopulmonary dysplasia (BPD) is the most prevalent and clinically significant complication of prematurity. Accurate identification of at-risk infants would enable ongoing intervention to improve outcomes. Although postnatal exposures are known to affect an infant's likelihood of developing...
Main authors: | Khurshid, Faiza; Coo, Helen; Khalil, Amal; Messiha, Jonathan; Ting, Joseph Y.; Wong, Jonathan; Shah, Prakesh S. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Frontiers Media S.A. 2021 |
Subjects: | Pediatrics |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8688959/ https://www.ncbi.nlm.nih.gov/pubmed/34950616 http://dx.doi.org/10.3389/fped.2021.759776 |
_version_ | 1784618457955827712 |
---|---|
author | Khurshid, Faiza; Coo, Helen; Khalil, Amal; Messiha, Jonathan; Ting, Joseph Y.; Wong, Jonathan; Shah, Prakesh S. |
author_sort | Khurshid, Faiza |
collection | PubMed |
description | Bronchopulmonary dysplasia (BPD) is the most prevalent and clinically significant complication of prematurity. Accurate identification of at-risk infants would enable ongoing intervention to improve outcomes. Although postnatal exposures are known to affect an infant's likelihood of developing BPD, most existing BPD prediction models do not allow risk to be evaluated at different time points, and/or are not suitable for use in ethno-diverse populations. A comprehensive approach to developing clinical prediction models avoids assumptions as to which method will yield the optimal results by testing multiple algorithms/models. We compared the performance of machine learning and logistic regression models in predicting BPD/death. Our main cohort included infants <33 weeks' gestational age (GA) admitted to a Canadian Neonatal Network site from 2016 to 2018 (n = 9,006) with all analyses repeated for the <29 weeks' GA subcohort (n = 4,246). Models were developed to predict, on days 1, 7, and 14 of admission to neonatal intensive care, the composite outcome of BPD/death prior to discharge. Ten-fold cross-validation and a 20% hold-out sample were used to measure area under the curve (AUC). Calibration intercepts and slopes were estimated by regressing the outcome on the log-odds of the predicted probabilities. The model AUCs ranged from 0.811 to 0.886. Model discrimination was lower in the <29 weeks' GA subcohort (AUCs 0.699–0.790). Several machine learning models had a suboptimal calibration intercept and/or slope (k-nearest neighbor, random forest, artificial neural network, stacking neural network ensemble). The top-performing algorithms will be used to develop multinomial models and an online risk estimator for predicting BPD severity and death that does not require information on ethnicity. |
format | Online Article Text |
id | pubmed-8688959 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-8688959 2021-12-22 Front Pediatr Pediatrics Frontiers Media S.A. 2021-12-07 /pmc/articles/PMC8688959/ /pubmed/34950616 http://dx.doi.org/10.3389/fped.2021.759776 Text en Copyright © 2021 Khurshid, Coo, Khalil, Messiha, Ting, Wong and Shah. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
title | Comparison of Multivariable Logistic Regression and Machine Learning Models for Predicting Bronchopulmonary Dysplasia or Death in Very Preterm Infants |
topic | Pediatrics |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8688959/ https://www.ncbi.nlm.nih.gov/pubmed/34950616 http://dx.doi.org/10.3389/fped.2021.759776 |
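The abstract in this record says that discrimination was measured by AUC and that calibration intercepts and slopes were estimated by regressing the outcome on the log-odds of the predicted probabilities. The following is a minimal sketch of those two calculations in plain Python, not the authors' code: the function names (`auc`, `calibration_intercept_slope`), the Newton-Raphson fitting routine, and the toy data are all assumptions made for illustration.

```python
import math


def logit(p, eps=1e-12):
    """Log-odds of a probability, clipped away from 0 and 1."""
    p = min(max(p, eps), 1.0 - eps)
    return math.log(p / (1.0 - p))


def auc(y_true, y_prob):
    """AUC as the chance that a random positive is scored above a
    random negative (ties count as one half)."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0
               for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


def calibration_intercept_slope(y_true, y_prob, iters=50):
    """Fit logit(P(y = 1)) = a + b * logit(p_hat) by Newton-Raphson.

    Hypothetical helper: returns (a, b). Well-calibrated predictions
    give a close to 0 and b close to 1.
    """
    x = [logit(p) for p in y_prob]
    a, b = 0.0, 1.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y_true):
            mu = 1.0 / (1.0 + math.exp(-(a + b * xi)))
            w = mu * (1.0 - mu)
            r = yi - mu
            g0 += r            # score for the intercept
            g1 += r * xi       # score for the slope
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        da = (h11 * g0 - h01 * g1) / det
        db = (h00 * g1 - h01 * g0) / det
        a += da
        b += db
        if abs(da) + abs(db) < 1e-10:
            break
    return a, b


# Toy data (made up): predictions of 0.1 and 0.9, but the observed
# event rates in those two groups are 0.25 and 0.75.
y = [1, 0, 0, 0, 1, 1, 1, 0]
p = [0.1] * 4 + [0.9] * 4
print(auc(y, p))
print(calibration_intercept_slope(y, p))
```

On this toy data the fitted slope comes out near 0.5 with an intercept near 0, the pattern expected when predictions are more extreme than the observed event rates. The sketch fits intercept and slope jointly, as the abstract's wording suggests; some authors instead estimate the intercept with the slope fixed at 1.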