Comparison of Multivariable Logistic Regression and Machine Learning Models for Predicting Bronchopulmonary Dysplasia or Death in Very Preterm Infants

Bronchopulmonary dysplasia (BPD) is the most prevalent and clinically significant complication of prematurity. Accurate identification of at-risk infants would enable ongoing intervention to improve outcomes. Although postnatal exposures are known to affect an infant's likelihood of developing...

Bibliographic Details
Main authors: Khurshid, Faiza; Coo, Helen; Khalil, Amal; Messiha, Jonathan; Ting, Joseph Y.; Wong, Jonathan; Shah, Prakesh S.
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2021
Subjects: Pediatrics
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8688959/
https://www.ncbi.nlm.nih.gov/pubmed/34950616
http://dx.doi.org/10.3389/fped.2021.759776
_version_ 1784618457955827712
author Khurshid, Faiza
Coo, Helen
Khalil, Amal
Messiha, Jonathan
Ting, Joseph Y.
Wong, Jonathan
Shah, Prakesh S.
collection PubMed
description Bronchopulmonary dysplasia (BPD) is the most prevalent and clinically significant complication of prematurity. Accurate identification of at-risk infants would enable ongoing intervention to improve outcomes. Although postnatal exposures are known to affect an infant's likelihood of developing BPD, most existing BPD prediction models do not allow risk to be evaluated at different time points, and/or are not suitable for use in ethno-diverse populations. A comprehensive approach to developing clinical prediction models avoids assumptions as to which method will yield the optimal results by testing multiple algorithms/models. We compared the performance of machine learning and logistic regression models in predicting BPD/death. Our main cohort included infants <33 weeks' gestational age (GA) admitted to a Canadian Neonatal Network site from 2016 to 2018 (n = 9,006) with all analyses repeated for the <29 weeks' GA subcohort (n = 4,246). Models were developed to predict, on days 1, 7, and 14 of admission to neonatal intensive care, the composite outcome of BPD/death prior to discharge. Ten-fold cross-validation and a 20% hold-out sample were used to measure area under the curve (AUC). Calibration intercepts and slopes were estimated by regressing the outcome on the log-odds of the predicted probabilities. The model AUCs ranged from 0.811 to 0.886. Model discrimination was lower in the <29 weeks' GA subcohort (AUCs 0.699–0.790). Several machine learning models had a suboptimal calibration intercept and/or slope (k-nearest neighbor, random forest, artificial neural network, stacking neural network ensemble). The top-performing algorithms will be used to develop multinomial models and an online risk estimator for predicting BPD severity and death that does not require information on ethnicity.
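The evaluation steps named in the description (AUC, and calibration intercept and slope obtained by regressing the outcome on the log-odds of the predicted probabilities) can be illustrated with a minimal sketch. This is not the authors' code: the function names, the simulated data, and the choice to estimate intercept and slope jointly in a single logistic regression fitted by Newton's method are all assumptions made for illustration, using only the Python standard library.

```python
import math
import random

def logit(p, eps=1e-12):
    """Log-odds of a probability, clipped away from 0 and 1."""
    p = min(max(p, eps), 1.0 - eps)
    return math.log(p / (1.0 - p))

def sigmoid(z):
    """Numerically stable inverse of logit."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def auc(y_true, y_score):
    """AUC as the share of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def calibration_intercept_slope(y_true, y_prob, n_iter=50, tol=1e-10):
    """Fit logit(P(y=1)) = a + b * logit(y_prob) by Newton's method.

    Assumption: intercept and slope are estimated jointly; a perfectly
    calibrated model gives a close to 0 and b close to 1.
    """
    x = [logit(p) for p in y_prob]
    a, b = 0.0, 0.0
    for _ in range(n_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y_true):
            mu = sigmoid(a + b * xi)
            w = mu * (1.0 - mu)          # Hessian weight
            r = yi - mu                  # score residual
            g0 += r
            g1 += r * xi
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        da = (h11 * g0 - h01 * g1) / det  # Newton step: H^{-1} g
        db = (h00 * g1 - h01 * g0) / det
        a += da
        b += db
        if abs(da) + abs(db) < tol:
            break
    return a, b

# Simulated (hypothetical) data: outcomes drawn from known probabilities.
random.seed(0)
true_logits = [random.gauss(-1.0, 1.5) for _ in range(2000)]
p_calibrated = [sigmoid(v) for v in true_logits]
outcome = [1 if random.random() < p else 0 for p in p_calibrated]
# An overconfident model: same ranking, but log-odds twice as extreme.
p_overconfident = [sigmoid(2.0 * v) for v in true_logits]

for name, probs in [("calibrated", p_calibrated), ("overconfident", p_overconfident)]:
    a, b = calibration_intercept_slope(outcome, probs)
    print(name, "AUC=%.3f intercept=%.3f slope=%.3f" % (auc(outcome, probs), a, b))
```

In this sketch both sets of predictions rank infants identically and so share the same AUC, while the overconfident set yields a calibration slope near half that of the calibrated set. That is the kind of discrimination-versus-calibration gap the description reports for several machine learning models.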
format Online
Article
Text
id pubmed-8688959
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-8688959 2021-12-22 Comparison of Multivariable Logistic Regression and Machine Learning Models for Predicting Bronchopulmonary Dysplasia or Death in Very Preterm Infants. Khurshid, Faiza; Coo, Helen; Khalil, Amal; Messiha, Jonathan; Ting, Joseph Y.; Wong, Jonathan; Shah, Prakesh S. Front Pediatr. Pediatrics. Frontiers Media S.A. 2021-12-07. /pmc/articles/PMC8688959/ /pubmed/34950616 http://dx.doi.org/10.3389/fped.2021.759776 Text en. Copyright © 2021 Khurshid, Coo, Khalil, Messiha, Ting, Wong and Shah. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title Comparison of Multivariable Logistic Regression and Machine Learning Models for Predicting Bronchopulmonary Dysplasia or Death in Very Preterm Infants
topic Pediatrics