Performance of active learning models for screening prioritization in systematic reviews: a simulation study into the Average Time to Discover relevant records
BACKGROUND: Conducting a systematic review demands a significant amount of effort in screening titles and abstracts. To accelerate this process, various tools that utilize active learning have been proposed. These tools allow the reviewer to interact with machine learning software to identify releva...
Main authors: | Ferdinands, Gerbrich; Schram, Raoul; de Bruin, Jonathan; Bagheri, Ayoub; Oberski, Daniel L.; Tummers, Lars; Teijema, Jelle Jasper; van de Schoot, Rens |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2023 |
Subjects: | Research |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10280866/ https://www.ncbi.nlm.nih.gov/pubmed/37340494 http://dx.doi.org/10.1186/s13643-023-02257-7 |
_version_ | 1785060891723563008 |
author | Ferdinands, Gerbrich Schram, Raoul de Bruin, Jonathan Bagheri, Ayoub Oberski, Daniel L. Tummers, Lars Teijema, Jelle Jasper van de Schoot, Rens |
author_facet | Ferdinands, Gerbrich Schram, Raoul de Bruin, Jonathan Bagheri, Ayoub Oberski, Daniel L. Tummers, Lars Teijema, Jelle Jasper van de Schoot, Rens |
author_sort | Ferdinands, Gerbrich |
collection | PubMed |
description | BACKGROUND: Conducting a systematic review demands a significant amount of effort in screening titles and abstracts. To accelerate this process, various tools that utilize active learning have been proposed. These tools allow the reviewer to interact with machine learning software to identify relevant publications as early as possible. The goal of this study is to gain a comprehensive understanding of active learning models for reducing the workload in systematic reviews through a simulation study. METHODS: The simulation study mimics the process of a human reviewer screening records while interacting with an active learning model. Different active learning models were compared based on four classification techniques (naive Bayes, logistic regression, support vector machines, and random forest) and two feature extraction strategies (TF-IDF and doc2vec). The performance of the models was compared for six systematic review datasets from different research areas. The evaluation of the models was based on the Work Saved over Sampling (WSS) and recall. Additionally, this study introduces two new statistics, Time to Discovery (TD) and Average Time to Discovery (ATD). RESULTS: The models reduce the number of publications needed to screen by 63.9 to 91.7% while still finding 95% of all relevant records (WSS@95). Recall of the models was defined as the proportion of relevant records found after screening 10% of all records and ranges from 53.6 to 99.8%. The ATD values range from 1.4 to 11.7%, indicating the average proportion of labeling decisions the researcher needs to make to detect a relevant record. The ATD values display a similar ranking across the simulations as the recall and WSS values. CONCLUSIONS: Active learning models for screening prioritization demonstrate significant potential for reducing the workload in systematic reviews. The Naive Bayes + TF-IDF model yielded the best results overall. 
The Average Time to Discovery (ATD) measures performance of active learning models throughout the entire screening process without the need for an arbitrary cut-off point. This makes the ATD a promising metric for comparing the performance of different models across different datasets. |
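The description above defines the evaluation metrics (WSS@95, recall after screening 10% of records, and the new TD/ATD statistics) precisely enough to sketch in code. Below is a minimal illustration, assuming a screening run is represented simply as a list of 0/1 relevance labels in the order the active learning model presented records; this is a sketch under that assumption, not the authors' implementation (the ASReview simulation code may define ranks or tie-breaking differently):

```python
import math

def time_to_discovery(labels):
    """TD per relevant record: the proportion of all labeling decisions
    made before (and including) finding that record.
    `labels` is the 0/1 relevance sequence in screening order."""
    n = len(labels)
    return [(i + 1) / n for i, y in enumerate(labels) if y == 1]

def average_time_to_discovery(labels):
    """ATD: mean TD over all relevant records; no cut-off point needed."""
    tds = time_to_discovery(labels)
    return sum(tds) / len(tds)

def recall_at(labels, proportion=0.10):
    """Proportion of relevant records found after screening the
    first `proportion` of all records."""
    k = int(len(labels) * proportion)
    return sum(labels[:k]) / sum(labels)

def wss_at(labels, recall_level=0.95):
    """Work Saved over Sampling at a given recall level: the fraction
    of records that did not need screening to reach that recall,
    corrected for what random-order screening would achieve."""
    target = math.ceil(recall_level * sum(labels))
    found = 0
    for i, y in enumerate(labels):
        found += y
        if found >= target:
            screened = i + 1
            break
    return (len(labels) - screened) / len(labels) - (1 - recall_level)
```

For a toy screening order of 10 records with the two relevant ones at positions 1 and 3, the TDs are 0.1 and 0.3, so the ATD is 0.2: on average, 20% of all labeling decisions were needed per relevant record. Because the ATD averages over every relevant record, it reflects the whole screening trajectory rather than performance at one arbitrary threshold.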
format | Online Article Text |
id | pubmed-10280866 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-102808662023-06-21 Performance of active learning models for screening prioritization in systematic reviews: a simulation study into the Average Time to Discover relevant records Ferdinands, Gerbrich Schram, Raoul de Bruin, Jonathan Bagheri, Ayoub Oberski, Daniel L. Tummers, Lars Teijema, Jelle Jasper van de Schoot, Rens Syst Rev Research BACKGROUND: Conducting a systematic review demands a significant amount of effort in screening titles and abstracts. To accelerate this process, various tools that utilize active learning have been proposed. These tools allow the reviewer to interact with machine learning software to identify relevant publications as early as possible. The goal of this study is to gain a comprehensive understanding of active learning models for reducing the workload in systematic reviews through a simulation study. METHODS: The simulation study mimics the process of a human reviewer screening records while interacting with an active learning model. Different active learning models were compared based on four classification techniques (naive Bayes, logistic regression, support vector machines, and random forest) and two feature extraction strategies (TF-IDF and doc2vec). The performance of the models was compared for six systematic review datasets from different research areas. The evaluation of the models was based on the Work Saved over Sampling (WSS) and recall. Additionally, this study introduces two new statistics, Time to Discovery (TD) and Average Time to Discovery (ATD). RESULTS: The models reduce the number of publications needed to screen by 63.9 to 91.7% while still finding 95% of all relevant records (WSS@95). Recall of the models was defined as the proportion of relevant records found after screening 10% of all records and ranges from 53.6 to 99.8%. The ATD values range from 1.4 to 11.7%, indicating the average proportion of labeling decisions the researcher needs to make to detect a relevant record. 
The ATD values display a similar ranking across the simulations as the recall and WSS values. CONCLUSIONS: Active learning models for screening prioritization demonstrate significant potential for reducing the workload in systematic reviews. The Naive Bayes + TF-IDF model yielded the best results overall. The Average Time to Discovery (ATD) measures performance of active learning models throughout the entire screening process without the need for an arbitrary cut-off point. This makes the ATD a promising metric for comparing the performance of different models across different datasets. BioMed Central 2023-06-20 /pmc/articles/PMC10280866/ /pubmed/37340494 http://dx.doi.org/10.1186/s13643-023-02257-7 Text en © The Author(s) 2023. Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License (https://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data. |
spellingShingle | Research Ferdinands, Gerbrich Schram, Raoul de Bruin, Jonathan Bagheri, Ayoub Oberski, Daniel L. Tummers, Lars Teijema, Jelle Jasper van de Schoot, Rens Performance of active learning models for screening prioritization in systematic reviews: a simulation study into the Average Time to Discover relevant records |
title | Performance of active learning models for screening prioritization in systematic reviews: a simulation study into the Average Time to Discover relevant records |
title_full | Performance of active learning models for screening prioritization in systematic reviews: a simulation study into the Average Time to Discover relevant records |
title_fullStr | Performance of active learning models for screening prioritization in systematic reviews: a simulation study into the Average Time to Discover relevant records |
title_full_unstemmed | Performance of active learning models for screening prioritization in systematic reviews: a simulation study into the Average Time to Discover relevant records |
title_short | Performance of active learning models for screening prioritization in systematic reviews: a simulation study into the Average Time to Discover relevant records |
title_sort | performance of active learning models for screening prioritization in systematic reviews: a simulation study into the average time to discover relevant records |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10280866/ https://www.ncbi.nlm.nih.gov/pubmed/37340494 http://dx.doi.org/10.1186/s13643-023-02257-7 |
work_keys_str_mv | AT ferdinandsgerbrich performanceofactivelearningmodelsforscreeningprioritizationinsystematicreviewsasimulationstudyintotheaveragetimetodiscoverrelevantrecords AT schramraoul performanceofactivelearningmodelsforscreeningprioritizationinsystematicreviewsasimulationstudyintotheaveragetimetodiscoverrelevantrecords AT debruinjonathan performanceofactivelearningmodelsforscreeningprioritizationinsystematicreviewsasimulationstudyintotheaveragetimetodiscoverrelevantrecords AT bagheriayoub performanceofactivelearningmodelsforscreeningprioritizationinsystematicreviewsasimulationstudyintotheaveragetimetodiscoverrelevantrecords AT oberskidaniell performanceofactivelearningmodelsforscreeningprioritizationinsystematicreviewsasimulationstudyintotheaveragetimetodiscoverrelevantrecords AT tummerslars performanceofactivelearningmodelsforscreeningprioritizationinsystematicreviewsasimulationstudyintotheaveragetimetodiscoverrelevantrecords AT teijemajellejasper performanceofactivelearningmodelsforscreeningprioritizationinsystematicreviewsasimulationstudyintotheaveragetimetodiscoverrelevantrecords AT vandeschootrens performanceofactivelearningmodelsforscreeningprioritizationinsystematicreviewsasimulationstudyintotheaveragetimetodiscoverrelevantrecords |