
Data extraction methods for systematic review (semi)automation: Update of a living systematic review

Background: The reliable and usable (semi)automation of data extraction can support the field of systematic review by reducing the workload required to gather information about the conduct and results of the included studies. This living systematic review examines published approaches for data extra...


Bibliographic Details
Main authors: Schmidt, Lena; Finnerty Mutlu, Ailbhe N.; Elmore, Rebecca; Olorisade, Babatunde K.; Thomas, James; Higgins, Julian P. T.
Format: Online Article Text
Language: English
Published: F1000 Research Limited 2023
Subjects: Systematic Review
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8361807/
https://www.ncbi.nlm.nih.gov/pubmed/34408850
http://dx.doi.org/10.12688/f1000research.51117.2
author Schmidt, Lena
Finnerty Mutlu, Ailbhe N.
Elmore, Rebecca
Olorisade, Babatunde K.
Thomas, James
Higgins, Julian P. T.
author_sort Schmidt, Lena
collection PubMed
description Background: The reliable and usable (semi)automation of data extraction can support the field of systematic review by reducing the workload required to gather information about the conduct and results of the included studies. This living systematic review examines published approaches for data extraction from reports of clinical studies. Methods: We systematically and continually search PubMed, ACL Anthology, arXiv, OpenAlex via EPPI-Reviewer, and the dblp computer science bibliography. Full text screening and data extraction are conducted within an open-source living systematic review application created for the purpose of this review. This living review update includes publications up to December 2022 and OpenAlex content up to March 2023. Results: 76 publications are included in this review. Of these, 64 (84%) addressed extraction of data from abstracts, while 19 (25%) used full texts. A total of 71 (93%) publications developed classifiers for randomised controlled trials. Over 30 entities were extracted, with PICOs (population, intervention, comparator, outcome) being the most frequently extracted. Data are available from 25 (33%), and code from 30 (39%) publications. Six (8%) implemented publicly available tools. Conclusions: This living systematic review presents an overview of (semi)automated data-extraction literature of interest to different types of literature review. We identified a broad evidence base of publications describing data extraction for interventional reviews and a small number of publications extracting epidemiological or diagnostic accuracy data. Between review updates, trends for sharing data and code increased strongly: in the base-review, data and code were available for 13 and 19% respectively; these numbers increased to 78 and 87% within the 23 new publications. Compared with the base-review, we observed another research trend, away from straightforward data extraction and towards additionally extracting relations between entities or automatic text summarisation. With this living review we aim to review the literature continually.
format Online
Article
Text
id pubmed-8361807
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher F1000 Research Limited
record_format MEDLINE/PubMed
spelling pubmed-8361807 2021-08-17 Data extraction methods for systematic review (semi)automation: Update of a living systematic review Schmidt, Lena Finnerty Mutlu, Ailbhe N. Elmore, Rebecca Olorisade, Babatunde K. Thomas, James Higgins, Julian P. T. F1000Res Systematic Review Background: The reliable and usable (semi)automation of data extraction can support the field of systematic review by reducing the workload required to gather information about the conduct and results of the included studies. This living systematic review examines published approaches for data extraction from reports of clinical studies. Methods: We systematically and continually search PubMed, ACL Anthology, arXiv, OpenAlex via EPPI-Reviewer, and the dblp computer science bibliography. Full text screening and data extraction are conducted within an open-source living systematic review application created for the purpose of this review. This living review update includes publications up to December 2022 and OpenAlex content up to March 2023. Results: 76 publications are included in this review. Of these, 64 (84%) addressed extraction of data from abstracts, while 19 (25%) used full texts. A total of 71 (93%) publications developed classifiers for randomised controlled trials. Over 30 entities were extracted, with PICOs (population, intervention, comparator, outcome) being the most frequently extracted. Data are available from 25 (33%), and code from 30 (39%) publications. Six (8%) implemented publicly available tools. Conclusions: This living systematic review presents an overview of (semi)automated data-extraction literature of interest to different types of literature review. We identified a broad evidence base of publications describing data extraction for interventional reviews and a small number of publications extracting epidemiological or diagnostic accuracy data. Between review updates, trends for sharing data and code increased strongly: in the base-review, data and code were available for 13 and 19% respectively; these numbers increased to 78 and 87% within the 23 new publications. Compared with the base-review, we observed another research trend, away from straightforward data extraction and towards additionally extracting relations between entities or automatic text summarisation. With this living review we aim to review the literature continually. F1000 Research Limited 2023-10-09 /pmc/articles/PMC8361807/ /pubmed/34408850 http://dx.doi.org/10.12688/f1000research.51117.2 Text en Copyright: © 2023 Schmidt L et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Data extraction methods for systematic review (semi)automation: Update of a living systematic review
topic Systematic Review
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8361807/
https://www.ncbi.nlm.nih.gov/pubmed/34408850
http://dx.doi.org/10.12688/f1000research.51117.2