Assessing the accuracy of machine-assisted abstract screening with DistillerAI: a user study

BACKGROUND: Web applications that employ natural language processing technologies to support systematic reviewers during abstract screening have become more common. The goal of our project was to conduct a case study to explore a screening approach that temporarily replaces a human screener with a s...

Bibliographic Details
Main authors:	Gartlehner, Gerald, Wagner, Gernot, Lux, Linda, Affengruber, Lisa, Dobrescu, Andreea, Kaminski-Hartenthaler, Angela, Viswanathan, Meera
Format:	Online Article Text
Language:	English
Published:	BioMed Central 2019
Subjects:	Research
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6857277/ https://www.ncbi.nlm.nih.gov/pubmed/31727159 http://dx.doi.org/10.1186/s13643-019-1221-3

_version_	1783470734617608192
author	Gartlehner, Gerald Wagner, Gernot Lux, Linda Affengruber, Lisa Dobrescu, Andreea Kaminski-Hartenthaler, Angela Viswanathan, Meera
author_facet	Gartlehner, Gerald Wagner, Gernot Lux, Linda Affengruber, Lisa Dobrescu, Andreea Kaminski-Hartenthaler, Angela Viswanathan, Meera
author_sort	Gartlehner, Gerald
collection	PubMed
description	BACKGROUND: Web applications that employ natural language processing technologies to support systematic reviewers during abstract screening have become more common. The goal of our project was to conduct a case study to explore a screening approach that temporarily replaces a human screener with a semi-automated screening tool. METHODS: We evaluated the accuracy of the approach using DistillerAI as a semi-automated screening tool. A published comparative effectiveness review served as the reference standard. Five teams of professional systematic reviewers screened the same 2472 abstracts in parallel. Each team trained DistillerAI with 300 randomly selected abstracts that the team screened dually. For all remaining abstracts, DistillerAI replaced one human screener and provided predictions about the relevance of records. A single reviewer also screened all remaining abstracts. A second human screener resolved conflicts between the single reviewer and DistillerAI. We compared the decisions of the machine-assisted approach, single-reviewer screening, and screening with DistillerAI alone against the reference standard. RESULTS: The combined sensitivity of the machine-assisted screening approach across the five screening teams was 78% (95% confidence interval [CI], 66 to 90%), and the combined specificity was 95% (95% CI, 92 to 97%). By comparison, the sensitivity of single-reviewer screening was similar (78%; 95% CI, 66 to 89%); however, the sensitivity of DistillerAI alone was substantially worse (14%; 95% CI, 0 to 31%) than that of the machine-assisted screening approach. Specificities for single-reviewer screening and DistillerAI were 94% (95% CI, 91 to 97%) and 98% (95% CI, 97 to 100%), respectively. Machine-assisted screening and single-reviewer screening had similar areas under the curve (0.87 and 0.86, respectively); by contrast, the area under the curve for DistillerAI alone was just slightly better than chance (0.56). The interrater agreement between human screeners and DistillerAI with a prevalence-adjusted kappa was 0.85 (95% CI, 0.84 to 0.86). CONCLUSIONS: The accuracy of DistillerAI is not yet adequate to replace a human screener temporarily during abstract screening for systematic reviews. Rapid reviews, which do not require detecting the totality of the relevant evidence, may find semi-automation tools to have greater utility than traditional systematic reviews.
format	Online Article Text
id	pubmed-6857277
institution	National Center for Biotechnology Information
language	English
publishDate	2019
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-6857277 2019-12-05 Assessing the accuracy of machine-assisted abstract screening with DistillerAI: a user study Gartlehner, Gerald Wagner, Gernot Lux, Linda Affengruber, Lisa Dobrescu, Andreea Kaminski-Hartenthaler, Angela Viswanathan, Meera Syst Rev Research BioMed Central 2019-11-15 /pmc/articles/PMC6857277/ /pubmed/31727159 http://dx.doi.org/10.1186/s13643-019-1221-3 Text en © The Author(s). 2019 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title	Assessing the accuracy of machine-assisted abstract screening with DistillerAI: a user study
title_sort	assessing the accuracy of machine-assisted abstract screening with distillerai: a user study
topic	Research
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6857277/ https://www.ncbi.nlm.nih.gov/pubmed/31727159 http://dx.doi.org/10.1186/s13643-019-1221-3
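
The accuracy figures in the abstract above can be cross-checked with a short calculation. The sketch below is not the authors' analysis code. It assumes that each screening approach yields a single include/exclude decision per abstract, in which case the area under the ROC curve reduces to the mean of sensitivity and specificity, and it assumes the standard two-category definition of the prevalence-adjusted kappa (PABAK = 2 x observed agreement - 1). The function names are hypothetical and the inputs are the rounded values reported in the abstract.

```python
def single_point_auc(sensitivity: float, specificity: float) -> float:
    """AUC of a classifier with one operating point (a single binary decision).

    With only one (sensitivity, specificity) pair, the ROC curve is two
    straight segments, and the area under it is the mean of the two values.
    """
    return (sensitivity + specificity) / 2


def observed_agreement_from_pabak(kappa: float) -> float:
    """Invert PABAK = 2 * Po - 1 to recover the observed agreement Po."""
    return (kappa + 1) / 2


# Rounded sensitivity and specificity as reported in the abstract.
reported = {
    "machine-assisted screening": (0.78, 0.95),
    "single-reviewer screening": (0.78, 0.94),
    "DistillerAI alone": (0.14, 0.98),
}

for approach, (sens, spec) in reported.items():
    print(f"{approach}: AUC about {single_point_auc(sens, spec):.3f}")

# Share of abstracts on which human screeners and DistillerAI agreed,
# implied by the reported prevalence-adjusted kappa of 0.85.
print(f"implied observed agreement: {observed_agreement_from_pabak(0.85):.3f}")
```

Under these assumptions the computed areas are close to the reported 0.87, 0.86, and 0.56, and a prevalence-adjusted kappa of 0.85 corresponds to agreement on roughly 92 to 93% of abstracts. High agreement alongside a sensitivity of 14% is what one would expect when relevant records are rare: the tool can match human screeners on most exclusions while still missing most of the few relevant abstracts.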
