
Risk of bias in studies on prediction models developed using supervised machine learning techniques: systematic review

OBJECTIVE: To assess the methodological quality of studies on prediction models developed using machine learning techniques across all medical specialties. DESIGN: Systematic review. DATA SOURCES: PubMed from 1 January 2018 to 31 December 2019. ELIGIBILITY CRITERIA: Articles reporting on the develop...


Bibliographic Details
Main authors:	Andaur Navarro, Constanza L; Damen, Johanna A A; Takada, Toshihiko; Nijman, Steven W J; Dhiman, Paula; Ma, Jie; Collins, Gary S; Bajpai, Ram; Riley, Richard D; Moons, Karel G M; Hooft, Lotty
Format:	Online Article Text
Language:	English
Published:	BMJ Publishing Group Ltd. 2021
Subjects:	Research
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8527348/ https://www.ncbi.nlm.nih.gov/pubmed/34670780 http://dx.doi.org/10.1136/bmj.n2281

_version_	1784586057548824576
author	Andaur Navarro, Constanza L; Damen, Johanna A A; Takada, Toshihiko; Nijman, Steven W J; Dhiman, Paula; Ma, Jie; Collins, Gary S; Bajpai, Ram; Riley, Richard D; Moons, Karel G M; Hooft, Lotty
author_sort	Andaur Navarro, Constanza L
collection	PubMed
description	OBJECTIVE: To assess the methodological quality of studies on prediction models developed using machine learning techniques across all medical specialties. DESIGN: Systematic review. DATA SOURCES: PubMed from 1 January 2018 to 31 December 2019. ELIGIBILITY CRITERIA: Articles reporting on the development, with or without external validation, of a multivariable prediction model (diagnostic or prognostic) developed using supervised machine learning for individualised predictions. No restrictions applied for study design, data source, or predicted patient related health outcomes. REVIEW METHODS: Methodological quality of the studies was determined and risk of bias evaluated using the prediction risk of bias assessment tool (PROBAST). This tool contains 21 signalling questions tailored to identify potential biases in four domains. Risk of bias was measured for each domain (participants, predictors, outcome, and analysis) and each study (overall). RESULTS: 152 studies were included: 58 (38%) included a diagnostic prediction model and 94 (62%) a prognostic prediction model. PROBAST was applied to 152 developed models and 19 external validations. Of these 171 analyses, 148 (87%, 95% confidence interval 81% to 91%) were rated at high risk of bias. The analysis domain was most frequently rated at high risk of bias. Of the 152 models, 85 (56%, 48% to 64%) were developed with an inadequate number of events per candidate predictor, 62 handled missing data inadequately (41%, 33% to 49%), and 59 assessed overfitting improperly (39%, 31% to 47%). Most models used appropriate data sources to develop (73%, 66% to 79%) and externally validate the machine learning based prediction models (74%, 51% to 88%). Information about blinding of outcome and blinding of predictors was, however, absent in 60 (40%, 32% to 47%) and 79 (52%, 44% to 60%) of the developed models, respectively. 
CONCLUSION: Most studies on machine learning based prediction models show poor methodological quality and are at high risk of bias. Factors contributing to risk of bias include small study size, poor handling of missing data, and failure to deal with overfitting. Efforts to improve the design, conduct, reporting, and validation of such studies are necessary to boost the application of machine learning based prediction models in clinical practice. SYSTEMATIC REVIEW REGISTRATION: PROSPERO CRD42019161764.
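The percentages and 95% confidence intervals quoted in the abstract can be checked from the reported counts. The sketch below is a minimal illustration that assumes a Wilson score interval with z = 1.96; the record does not state which interval method the authors used, so that choice is an assumption, and the function name `wilson_interval` and the dictionary labels are hypothetical. Under that assumption, the three counts below give bounds that agree with the abstract after rounding to whole percentages.

```python
from math import sqrt


def wilson_interval(successes: int, total: int, z: float = 1.96) -> tuple:
    """Wilson score interval for a binomial proportion (z = 1.96 for about 95%)."""
    if total <= 0 or not 0 <= successes <= total:
        raise ValueError("need total > 0 and 0 <= successes <= total")
    p = successes / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half_width = z * sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return centre - half_width, centre + half_width


# Counts copied from the abstract; the labels are shorthand, not the authors' wording.
reported = {
    "high risk of bias (of 171 analyses)": (148, 171),
    "inadequate events per candidate predictor (of 152 models)": (85, 152),
    "missing data handled inadequately (of 152 models)": (62, 152),
}

for label, (k, n) in reported.items():
    low, high = wilson_interval(k, n)
    # Print the point estimate and interval as whole percentages.
    print(f"{label}: {k / n:.0%} ({low:.0%} to {high:.0%})")
```

Agreement after rounding does not prove which method was used; other interval methods can give the same whole-percentage bounds at these sample sizes.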
format	Online Article Text
id	pubmed-8527348
institution	National Center for Biotechnology Information
language	English
publishDate	2021
publisher	BMJ Publishing Group Ltd.
record_format	MEDLINE/PubMed
spelling	pubmed-8527348 2021-11-04 Risk of bias in studies on prediction models developed using supervised machine learning techniques: systematic review Andaur Navarro, Constanza L; Damen, Johanna A A; Takada, Toshihiko; Nijman, Steven W J; Dhiman, Paula; Ma, Jie; Collins, Gary S; Bajpai, Ram; Riley, Richard D; Moons, Karel G M; Hooft, Lotty BMJ Research BMJ Publishing Group Ltd. 2021-10-20 /pmc/articles/PMC8527348/ /pubmed/34670780 http://dx.doi.org/10.1136/bmj.n2281 Text en © Author(s) (or their employer(s)) 2019. Re-use permitted under CC BY. No commercial re-use. See rights and permissions. Published by BMJ. https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed in accordance with the terms of the Creative Commons Attribution (CC BY 4.0) license, which permits others to distribute, remix, adapt and build upon this work, for commercial use, provided the original work is properly cited. See: http://creativecommons.org/licenses/by/4.0/ (https://creativecommons.org/licenses/by/4.0/).
title	Risk of bias in studies on prediction models developed using supervised machine learning techniques: systematic review
title_sort	risk of bias in studies on prediction models developed using supervised machine learning techniques: systematic review
topic	Research
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8527348/ https://www.ncbi.nlm.nih.gov/pubmed/34670780 http://dx.doi.org/10.1136/bmj.n2281


Similar Items