Machine learning for identifying Randomized Controlled Trials: An evaluation and practitioner's guide

Machine learning (ML) algorithms have proven highly accurate for identifying Randomized Controlled Trials (RCTs) but are not used much in practice, in part because the best way to make use of the technology in a typical workflow is unclear. In this work, we evaluate ML models for RCT classification...

Bibliographic Details
Main Authors:	Marshall, Iain J., Noel‐Storr, Anna, Kuiper, Joël, Thomas, James, Wallace, Byron C.
Format:	Online Article Text
Language:	English
Published:	John Wiley and Sons Inc. 2018
Subjects:	Special Issue Papers
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6030513/ https://www.ncbi.nlm.nih.gov/pubmed/29314757 http://dx.doi.org/10.1002/jrsm.1287

_version_	1783337154777186304
author	Marshall, Iain J. Noel‐Storr, Anna Kuiper, Joël Thomas, James Wallace, Byron C.
author_facet	Marshall, Iain J. Noel‐Storr, Anna Kuiper, Joël Thomas, James Wallace, Byron C.
author_sort	Marshall, Iain J.
collection	PubMed
description	Machine learning (ML) algorithms have proven highly accurate for identifying Randomized Controlled Trials (RCTs) but are not used much in practice, in part because the best way to make use of the technology in a typical workflow is unclear. In this work, we evaluate ML models for RCT classification (support vector machines, convolutional neural networks, and ensemble approaches). We trained and optimized support vector machine and convolutional neural network models on the titles and abstracts of the Cochrane Crowd RCT set. We evaluated the models on an external dataset (Clinical Hedges), allowing direct comparison with traditional database search filters. We estimated area under receiver operating characteristics (AUROC) using the Clinical Hedges dataset. We demonstrate that ML approaches better discriminate between RCTs and non‐RCTs than widely used traditional database search filters at all sensitivity levels; our best‐performing model also achieved the best results to date for ML in this task (AUROC 0.987, 95% CI, 0.984‐0.989). We provide practical guidance on the role of ML in (1) systematic reviews (high‐sensitivity strategies) and (2) rapid reviews and clinical question answering (high‐precision strategies) together with recommended probability cutoffs for each use case. Finally, we provide open‐source software to enable these approaches to be used in practice.
format	Online Article Text
id	pubmed-6030513
institution	National Center for Biotechnology Information
language	English
publishDate	2018
publisher	John Wiley and Sons Inc.
record_format	MEDLINE/PubMed
spelling	pubmed-6030513 2018-12-07 Machine learning for identifying Randomized Controlled Trials: An evaluation and practitioner's guide Marshall, Iain J. Noel‐Storr, Anna Kuiper, Joël Thomas, James Wallace, Byron C. Res Synth Methods Special Issue Papers John Wiley and Sons Inc. 2018-02-07 2018-12 /pmc/articles/PMC6030513/ /pubmed/29314757 http://dx.doi.org/10.1002/jrsm.1287 Text en © 2018 The Authors. Research Synthesis Methods published by John Wiley & Sons Ltd.
This is an open access article under the terms of the http://creativecommons.org/licenses/by/4.0/ License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited.
title	Machine learning for identifying Randomized Controlled Trials: An evaluation and practitioner's guide
topic	Special Issue Papers
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6030513/ https://www.ncbi.nlm.nih.gov/pubmed/29314757 http://dx.doi.org/10.1002/jrsm.1287
