Semi-automated screening of biomedical citations for systematic reviews

Bibliographic Details
Main Authors: Wallace, Byron C; Trikalinos, Thomas A; Lau, Joseph; Brodley, Carla; Schmid, Christopher H
Format: Text
Language: English
Published: BioMed Central 2010
Subjects: Research article
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2824679/
https://www.ncbi.nlm.nih.gov/pubmed/20102628
http://dx.doi.org/10.1186/1471-2105-11-55
author Wallace, Byron C
Trikalinos, Thomas A
Lau, Joseph
Brodley, Carla
Schmid, Christopher H
collection PubMed
description BACKGROUND: Systematic reviews address a specific clinical question by unbiasedly assessing and analyzing the pertinent literature. Citation screening is a time-consuming and critical step in systematic reviews. Typically, reviewers must evaluate thousands of citations to identify articles eligible for a given review. We explore the application of machine learning techniques to semi-automate citation screening, thereby reducing the reviewers' workload. RESULTS: We present a novel online classification strategy for citation screening to automatically discriminate "relevant" from "irrelevant" citations. We use an ensemble of Support Vector Machines (SVMs) built over different feature-spaces (e.g., abstract and title text), and trained interactively by the reviewer(s). Semi-automating the citation screening process is difficult because any such strategy must identify all citations eligible for the systematic review. This requirement is made harder still due to class imbalance; there are far fewer "relevant" than "irrelevant" citations for any given systematic review. To address these challenges we employ a custom active-learning strategy developed specifically for imbalanced datasets. Further, we introduce a novel undersampling technique. We provide experimental results over three real-world systematic review datasets, and demonstrate that our algorithm is able to reduce the number of citations that must be screened manually by nearly half in two of these, and by around 40% in the third, without excluding any of the citations eligible for the systematic review. CONCLUSIONS: We have developed a semi-automated citation screening algorithm for systematic reviews that has the potential to substantially reduce the number of citations reviewers have to manually screen, without compromising the quality and comprehensiveness of the review.
format Text
id pubmed-2824679
institution National Center for Biotechnology Information
language English
publishDate 2010
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-2824679 2010-02-19 BMC Bioinformatics Research article BioMed Central 2010-01-26 /pmc/articles/PMC2824679/ /pubmed/20102628 http://dx.doi.org/10.1186/1471-2105-11-55 Text en Copyright ©2010 Wallace et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Semi-automated screening of biomedical citations for systematic reviews
topic Research article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2824679/
https://www.ncbi.nlm.nih.gov/pubmed/20102628
http://dx.doi.org/10.1186/1471-2105-11-55