Automated confidence ranked classification of randomized controlled trial articles: an aid to evidence-based medicine

Objective: For many literature review tasks, including systematic review (SR) and other aspects of evidence-based medicine, it is important to know whether an article describes a randomized controlled trial (RCT). Current manual annotation is not complete or flexible enough for the SR process. In th...

Bibliographic Details
Main Authors:	Cohen, Aaron M; Smalheiser, Neil R; McDonagh, Marian S; Yu, Clement; Adams, Clive E; Davis, John M; Yu, Philip S
Format:	Online Article Text
Language:	English
Published:	Oxford University Press 2015
Subjects:	Research and Applications
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4457112/ https://www.ncbi.nlm.nih.gov/pubmed/25656516 http://dx.doi.org/10.1093/jamia/ocu025

_version_	1782374939833139200
author	Cohen, Aaron M; Smalheiser, Neil R; McDonagh, Marian S; Yu, Clement; Adams, Clive E; Davis, John M; Yu, Philip S
author_facet	Cohen, Aaron M; Smalheiser, Neil R; McDonagh, Marian S; Yu, Clement; Adams, Clive E; Davis, John M; Yu, Philip S
author_sort	Cohen, Aaron M
collection	PubMed
description	Objective: For many literature review tasks, including systematic review (SR) and other aspects of evidence-based medicine, it is important to know whether an article describes a randomized controlled trial (RCT). Current manual annotation is not complete or flexible enough for the SR process. In this work, highly accurate machine learning predictive models were built that include confidence predictions of whether an article is an RCT. Materials and Methods: The LibSVM classifier was used with forward selection of potential feature sets on a large human-related subset of MEDLINE to create a classification model requiring only the citation, abstract, and MeSH terms for each article. Results: The model achieved an area under the receiver operating characteristic curve of 0.973 and mean squared error of 0.013 on the held out year 2011 data. Accurate confidence estimates were confirmed on a manually reviewed set of test articles. A second model not requiring MeSH terms was also created, and performs almost as well. Discussion: Both models accurately rank and predict article RCT confidence. Using the model and the manually reviewed samples, it is estimated that about 8000 (3%) additional RCTs can be identified in MEDLINE, and that 5% of articles tagged as RCTs in Medline may not be identified. Conclusion: Retagging human-related studies with a continuously valued RCT confidence is potentially more useful for article ranking and review than a simple yes/no prediction. The automated RCT tagging tool should offer significant savings of time and effort during the process of writing SRs, and is a key component of a multistep text mining pipeline that we are building to streamline SR workflow. In addition, the model may be useful for identifying errors in MEDLINE publication types. 
The RCT confidence predictions described here have been made available to users as a web service with a user query form front end at: http://arrowsmith.psych.uic.edu/cgi-bin/arrowsmith_uic/RCT_Tagger.cgi.
format	Online Article Text
id	pubmed-4457112
institution	National Center for Biotechnology Information
language	English
publishDate	2015
publisher	Oxford University Press
record_format	MEDLINE/PubMed
spelling	pubmed-4457112 2016-05-01 J Am Med Inform Assoc Oxford University Press 2015-05 2015-02-05 /pmc/articles/PMC4457112/ /pubmed/25656516 http://dx.doi.org/10.1093/jamia/ocu025 Text en © The Author 2015. Published by Oxford University Press on behalf of the American Medical Informatics Association. http://creativecommons.org/licenses/by-nc/4.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
title	Automated confidence ranked classification of randomized controlled trial articles: an aid to evidence-based medicine
topic	Research and Applications
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4457112/ https://www.ncbi.nlm.nih.gov/pubmed/25656516 http://dx.doi.org/10.1093/jamia/ocu025

Similar Items