
Machine learning for screening prioritization in systematic reviews: comparative performance of Abstrackr and EPPI-Reviewer

Bibliographic Details
Main Authors:	Tsou, Amy Y., Treadwell, Jonathan R., Erinoff, Eileen, Schoelles, Karen
Format:	Online Article Text
Language:	English
Published:	BioMed Central 2020
Subjects:	Research
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7118839/ https://www.ncbi.nlm.nih.gov/pubmed/32241297 http://dx.doi.org/10.1186/s13643-020-01324-7

collection	PubMed
description	BACKGROUND: Improving the speed of systematic review (SR) development is key to supporting evidence-based medicine. Machine learning tools that semi-automate citation screening might improve efficiency. Few studies have assessed the use of screening prioritization functionality or compared two tools head to head. In this project, we compared the performance of two machine-learning tools for potential use in citation screening.
METHODS: Using 9 evidence reports previously completed by the ECRI Institute Evidence-based Practice Center team, we compared the performance of Abstrackr and EPPI-Reviewer, two off-the-shelf citation screening tools, for identifying relevant citations. Screening prioritization functionality was tested for 3 large reports and 6 small reports on a range of clinical topics. Large report topics were imaging for pancreatic cancer, indoor allergen reduction, and inguinal hernia repair. We trained Abstrackr and EPPI-Reviewer and screened all citations in 10% increments. In Task 1, we entered whether an abstract was ordered for full-text screening; in Task 2, we entered whether an abstract was included in the final report. For both tasks, screening continued until all studies ordered and included for the actual reports were identified. We assessed potential reductions in hypothetical screening burden (proportion of citations screened to identify all included studies) offered by each tool for all 9 reports.
RESULTS: For the 3 large reports, both EPPI-Reviewer and Abstrackr performed well, with potential reductions in screening burden of 4 to 49% (Abstrackr) and 9 to 60% (EPPI-Reviewer). Both tools had markedly poorer performance for 1 large report (inguinal hernia), possibly due to its heterogeneous key questions. Based on McNemar’s test for paired proportions in the 3 large reports, EPPI-Reviewer outperformed Abstrackr for identifying articles ordered for full-text review, but Abstrackr performed better in 2 of 3 reports for identifying articles included in the final report. For small reports, both tools provided benefits, but EPPI-Reviewer generally outperformed Abstrackr in both tasks, although these results were often not statistically significant.
CONCLUSIONS: Abstrackr and EPPI-Reviewer performed well, but prioritization accuracy varied greatly across reports. Our work suggests screening prioritization functionality is a promising modality offering efficiency gains without giving up human involvement in the screening process.
format	Online Article Text
id	pubmed-7118839
institution	National Center for Biotechnology Information
language	English
publishDate	2020
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-7118839 2020-04-07 Syst Rev Research BioMed Central 2020-04-02 /pmc/articles/PMC7118839/ /pubmed/32241297 http://dx.doi.org/10.1186/s13643-020-01324-7 Text en © The Author(s) 2020 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
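The abstract above defines screening burden as the proportion of citations screened to identify all included studies, with screening done in 10% increments, and it names McNemar's test for paired proportions. The following is a minimal Python sketch of both calculations, not the authors' analysis code. The function names, the example ranking, and the discordant-pair counts are hypothetical, and the abstract does not state how the paired outcomes were defined or whether an exact or chi-square form of the test was used; the exact binomial form is assumed here.

```python
import math


def screening_burden(ranked_labels, increment=0.10):
    """Return the fraction of citations screened, in `increment` steps,
    before every relevant citation has been seen.

    ranked_labels: 0/1 relevance flags in the order the tool presents them.
    """
    n = len(ranked_labels)
    total_relevant = sum(ranked_labels)
    if n == 0 or total_relevant == 0:
        raise ValueError("need at least one citation and one relevant citation")
    steps = round(1 / increment)
    for step in range(1, steps + 1):
        cutoff = math.ceil(n * step / steps)  # citations screened so far
        if sum(ranked_labels[:cutoff]) == total_relevant:
            return step / steps
    return 1.0


def mcnemar_exact(b, c):
    """Two-sided exact McNemar p-value from the two discordant-pair counts
    (assumed form; pairs where only one tool succeeded)."""
    n = b + c
    if n == 0:
        return 1.0
    tail = sum(math.comb(n, k) for k in range(min(b, c) + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical ranking of 10 citations, 3 of them relevant.
ranking = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
burden = screening_burden(ranking)
print(f"screened {burden:.0%}, potential reduction {1 - burden:.0%}")

# Hypothetical discordant counts: 0 citations found only by tool A, 5 only by tool B.
print(f"McNemar exact p = {mcnemar_exact(0, 5):.4f}")
```

The potential reduction in screening burden is one minus the returned fraction, which corresponds to the kind of percentages quoted in the RESULTS section.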
