Automating data extraction in systematic reviews: a systematic review

BACKGROUND: Automation of parts of the systematic review process, specifically the data extraction step, may be an important strategy to reduce the time necessary to complete a systematic review. However, the state of the science of automatically extracting data elements from full texts has not been...

Bibliographic Details
Main Authors: Jonnalagadda, Siddhartha R., Goyal, Pawan, Huffman, Mark D.
Format: Online Article Text
Language: English
Published: BioMed Central 2015
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4514954/
https://www.ncbi.nlm.nih.gov/pubmed/26073888
http://dx.doi.org/10.1186/s13643-015-0066-7
author Jonnalagadda, Siddhartha R.
Goyal, Pawan
Huffman, Mark D.
collection PubMed
description BACKGROUND: Automation of parts of the systematic review process, specifically the data extraction step, may be an important strategy to reduce the time necessary to complete a systematic review. However, the state of the science of automatically extracting data elements from full texts has not been well described. This paper performs a systematic review of published and unpublished methods to automate data extraction for systematic reviews. METHODS: We systematically searched PubMed, IEEE Xplore, and ACM Digital Library to identify potentially relevant articles. We included reports that met the following criteria: 1) the methods or results section described which entities were or needed to be extracted, and 2) at least one entity was automatically extracted, with evaluation results presented for that entity. We also reviewed the citations from included reports. RESULTS: Out of a total of 1190 unique citations that met our search criteria, we found 26 published reports describing automatic extraction of at least one of more than 52 potential data elements used in systematic reviews. For 25 (48%) of the data elements used in systematic reviews, various researchers had attempted to extract information automatically from the publication text. Out of these, 14 (27%) data elements were completely extracted, but the highest number of data elements extracted automatically by a single study was 7. Most of the data elements were extracted with F-scores (the harmonic mean of sensitivity and positive predictive value) of over 70%. CONCLUSIONS: We found no unified information extraction framework tailored to the systematic review process, and published reports focused on a limited number (1–7) of data elements. Biomedical natural language processing techniques have not been fully utilized to automate, fully or even partially, the data extraction step of systematic reviews.
format Online
Article
Text
id pubmed-4514954
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4514954 2015-07-26 Automating data extraction in systematic reviews: a systematic review Jonnalagadda, Siddhartha R. Goyal, Pawan Huffman, Mark D. Syst Rev Research BioMed Central 2015-06-15 /pmc/articles/PMC4514954/ /pubmed/26073888 http://dx.doi.org/10.1186/s13643-015-0066-7 Text en © Jonnalagadda et al. 2015 This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Automating data extraction in systematic reviews: a systematic review
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4514954/
https://www.ncbi.nlm.nih.gov/pubmed/26073888
http://dx.doi.org/10.1186/s13643-015-0066-7
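
The abstract reports F-scores and percentages of data elements without showing the arithmetic. As a minimal sketch, assuming the conventional definition of the F-score as the harmonic mean of sensitivity (recall) and positive predictive value (precision), the calculation could look like the following. The function names `f_score` and `share_percent` and the sample sensitivity/PPV values are hypothetical and chosen for illustration; only the counts 25, 14, and 52 come from the abstract.

```python
def f_score(sensitivity: float, ppv: float, beta: float = 1.0) -> float:
    """F-beta score from sensitivity (recall) and PPV (precision).

    With beta = 1 this is the plain harmonic mean of the two values.
    """
    if not (0.0 <= sensitivity <= 1.0 and 0.0 <= ppv <= 1.0):
        raise ValueError("sensitivity and ppv must lie in [0, 1]")
    if sensitivity == 0.0 and ppv == 0.0:
        return 0.0  # avoid division by zero when both inputs are zero
    b2 = beta * beta
    return (1.0 + b2) * ppv * sensitivity / (b2 * ppv + sensitivity)


def share_percent(part: int, total: int) -> int:
    """Share of `part` in `total`, rounded to a whole percent."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100 * part / total)


if __name__ == "__main__":
    # Hypothetical extractor performance, not taken from the reviewed studies.
    print(f"F1 = {f_score(sensitivity=0.80, ppv=0.70):.3f}")

    # Counts reported in the abstract: 25 and 14 of 52 data elements.
    print(share_percent(25, 52))  # 48
    print(share_percent(14, 52))  # 27
```

Because the harmonic mean is pulled toward the smaller of its two inputs, an extractor cannot reach an F-score above 70% by being strong on only one of sensitivity or positive predictive value.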