
Semi-automating abstract screening with a natural language model pretrained on biomedical literature


Bibliographic Details
Main Authors: Ng, Sheryl Hui-Xian, Teow, Kiok Liang, Ang, Gary Yee, Tan, Woan Shin, Hum, Allyn
Format: Online Article Text
Language: English
Published: BioMed Central, 2023
Subjects: Letter
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10517490/
https://www.ncbi.nlm.nih.gov/pubmed/37740227
http://dx.doi.org/10.1186/s13643-023-02353-8
Description: We demonstrate the performance and workload impact of incorporating a natural language model, pretrained on citations of biomedical literature, on a workflow of abstract screening for studies on prognostic factors in end-stage lung disease. The model was optimized on one-third of the abstracts, and model performance on the remaining abstracts was reported. Performance of the model, in terms of sensitivity, precision, F1 and inter-rater agreement, was moderate in comparison with other published models. However, incorporating it into the screening workflow, with the second reviewer screening only abstracts with conflicting decisions, translated into a 65% reduction in the number of abstracts screened by the second reviewer. Subsequent work will look at incorporating the pre-trained BERT model into screening workflows for other studies prospectively, as well as improving model performance. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s13643-023-02353-8.
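The abstract describes a triage rule: the model screens every abstract alongside the first reviewer, and the second reviewer handles only the abstracts where the two disagree, which is what drives the reported workload reduction. A minimal Python sketch of that rule and of the reported metrics (sensitivity, precision, F1); all function names and example labels are illustrative assumptions, not taken from the paper:

```python
def second_reviewer_fraction(first_reviewer, model):
    """Fraction of abstracts the second reviewer must screen:
    only those where the model and the first reviewer disagree."""
    conflicts = sum(1 for r, m in zip(first_reviewer, model) if r != m)
    return conflicts / len(model)

def precision_sensitivity_f1(truth, predicted):
    """Standard binary-classification metrics over 0/1 include-labels."""
    tp = sum(1 for t, p in zip(truth, predicted) if t and p)
    fp = sum(1 for t, p in zip(truth, predicted) if not t and p)
    fn = sum(1 for t, p in zip(truth, predicted) if t and not p)
    precision = tp / (tp + fp) if tp + fp else 0.0
    sensitivity = tp / (tp + fn) if tp + fn else 0.0  # a.k.a. recall
    f1 = (2 * precision * sensitivity / (precision + sensitivity)
          if precision + sensitivity else 0.0)
    return precision, sensitivity, f1
```

Under this rule, a 65% reduction corresponds to the model and the first reviewer agreeing on 65% of abstracts, leaving 35% in conflict for the second reviewer.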
Journal: Syst Rev (Letter)
Published online: 23 September 2023
License: © The Author(s) 2023. Open Access under the Creative Commons Attribution 4.0 International License (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/.