An Automated Literature Review Tool (LiteRev) for Streamlining and Accelerating Research Using Natural Language Processing and Machine Learning: Descriptive Performance Evaluation Study
Main authors: | Orel, Erol; Ciglenecki, Iza; Thiabaud, Amaury; Temerev, Alexander; Calmy, Alexandra; Keiser, Olivia; Merzouki, Aziza |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | JMIR Publications, 2023 |
Subjects: | Original Paper |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10541641/ https://www.ncbi.nlm.nih.gov/pubmed/37713261 http://dx.doi.org/10.2196/39736 |
_version_ | 1785113939439255552 |
---|---|
author | Orel, Erol; Ciglenecki, Iza; Thiabaud, Amaury; Temerev, Alexander; Calmy, Alexandra; Keiser, Olivia; Merzouki, Aziza |
author_facet | Orel, Erol; Ciglenecki, Iza; Thiabaud, Amaury; Temerev, Alexander; Calmy, Alexandra; Keiser, Olivia; Merzouki, Aziza |
author_sort | Orel, Erol |
collection | PubMed |
description | BACKGROUND: Literature reviews (LRs) identify, evaluate, and synthesize papers relevant to a particular research question to advance understanding and support decision-making. However, LRs, especially traditional systematic reviews, are slow, resource-intensive, and become outdated quickly. OBJECTIVE: LiteRev is an advanced and enhanced version of an existing automation tool designed to assist researchers in conducting LRs through cutting-edge natural language processing and machine learning techniques. In this paper, we present a comprehensive explanation of LiteRev’s capabilities and methodology, and an evaluation of its accuracy and efficiency compared with a manual LR, highlighting the benefits of using LiteRev. METHODS: Based on the user’s query, LiteRev performs an automated search on a wide range of open-access databases and retrieves relevant metadata on the resulting papers, including abstracts or full texts when available. These abstracts (or full texts) are text processed and represented as a term frequency-inverse document frequency (TF-IDF) matrix. Using dimensionality reduction (pairwise controlled manifold approximation) and clustering (hierarchical density-based spatial clustering of applications with noise) techniques, the corpus is divided into topics, each described by a list of its most important keywords. The user can then select one or several topics of interest, enter additional keywords to refine the search, or provide key papers relevant to the research question. Based on these inputs, LiteRev performs a k-nearest neighbor (k-NN) search and suggests a list of potentially interesting papers. By tagging the relevant ones, the user triggers new k-NN searches until no additional paper is suggested for screening. To assess the performance of LiteRev, we ran it in parallel to a manual LR on the burden of and care for acute and early HIV infection in sub-Saharan Africa. We assessed the performance of LiteRev using true and false predictive values, recall, and work saved over sampling. RESULTS: LiteRev extracted the text of 631 unique papers from PubMed, processed it, and transformed it into a TF-IDF matrix. The topic modeling module identified 16 topics and highlighted 2 topics of interest to the research question. Based on 18 key papers, the k-NN module suggested 193 papers for screening out of 613 papers in total (31.5% of the whole corpus) and correctly identified 64 of the 87 relevant papers found by the manual abstract screening (recall rate of 73.6%). Compared with the manual full-text screening, LiteRev identified 42 of the 48 relevant papers found manually (recall rate of 87.5%). This represents a total work saved over sampling of 56%. CONCLUSIONS: We presented the features and functionalities of LiteRev, an automation tool that uses natural language processing and machine learning methods to streamline and accelerate LRs and support researchers in getting quick and in-depth overviews of any topic of interest. (Illustrative code sketches of the pipeline and evaluation metrics summarized here are provided after the record fields below.) |
format | Online Article Text |
id | pubmed-10541641 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | JMIR Publications |
record_format | MEDLINE/PubMed |
spelling | pubmed-105416412023-10-02 An Automated Literature Review Tool (LiteRev) for Streamlining and Accelerating Research Using Natural Language Processing and Machine Learning: Descriptive Performance Evaluation Study Orel, Erol Ciglenecki, Iza Thiabaud, Amaury Temerev, Alexander Calmy, Alexandra Keiser, Olivia Merzouki, Aziza J Med Internet Res Original Paper BACKGROUND: Literature reviews (LRs) identify, evaluate, and synthesize papers relevant to a particular research question to advance understanding and support decision-making. However, LRs, especially traditional systematic reviews, are slow, resource-intensive, and become outdated quickly. OBJECTIVE: LiteRev is an advanced and enhanced version of an existing automation tool designed to assist researchers in conducting LRs through cutting-edge natural language processing and machine learning techniques. In this paper, we present a comprehensive explanation of LiteRev’s capabilities and methodology, and an evaluation of its accuracy and efficiency compared with a manual LR, highlighting the benefits of using LiteRev. METHODS: Based on the user’s query, LiteRev performs an automated search on a wide range of open-access databases and retrieves relevant metadata on the resulting papers, including abstracts or full texts when available. These abstracts (or full texts) are text processed and represented as a term frequency-inverse document frequency (TF-IDF) matrix. Using dimensionality reduction (pairwise controlled manifold approximation) and clustering (hierarchical density-based spatial clustering of applications with noise) techniques, the corpus is divided into topics, each described by a list of its most important keywords. The user can then select one or several topics of interest, enter additional keywords to refine the search, or provide key papers relevant to the research question. Based on these inputs, LiteRev performs a k-nearest neighbor (k-NN) search and suggests a list of potentially interesting papers. By tagging the relevant ones, the user triggers new k-NN searches until no additional paper is suggested for screening. To assess the performance of LiteRev, we ran it in parallel to a manual LR on the burden of and care for acute and early HIV infection in sub-Saharan Africa. We assessed the performance of LiteRev using true and false predictive values, recall, and work saved over sampling. RESULTS: LiteRev extracted the text of 631 unique papers from PubMed, processed it, and transformed it into a TF-IDF matrix. The topic modeling module identified 16 topics and highlighted 2 topics of interest to the research question. Based on 18 key papers, the k-NN module suggested 193 papers for screening out of 613 papers in total (31.5% of the whole corpus) and correctly identified 64 of the 87 relevant papers found by the manual abstract screening (recall rate of 73.6%). Compared with the manual full-text screening, LiteRev identified 42 of the 48 relevant papers found manually (recall rate of 87.5%). This represents a total work saved over sampling of 56%. CONCLUSIONS: We presented the features and functionalities of LiteRev, an automation tool that uses natural language processing and machine learning methods to streamline and accelerate LRs and support researchers in getting quick and in-depth overviews of any topic of interest. 
JMIR Publications 2023-09-15 /pmc/articles/PMC10541641/ /pubmed/37713261 http://dx.doi.org/10.2196/39736 Text en ©Erol Orel, Iza Ciglenecki, Amaury Thiabaud, Alexander Temerev, Alexandra Calmy, Olivia Keiser, Aziza Merzouki. Originally published in the Journal of Medical Internet Research (https://www.jmir.org), 15.09.2023. https://creativecommons.org/licenses/by/4.0/This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in the Journal of Medical Internet Research, is properly cited. The complete bibliographic information, a link to the original publication on https://www.jmir.org/, as well as this copyright and license information must be included. |
spellingShingle | Original Paper Orel, Erol Ciglenecki, Iza Thiabaud, Amaury Temerev, Alexander Calmy, Alexandra Keiser, Olivia Merzouki, Aziza An Automated Literature Review Tool (LiteRev) for Streamlining and Accelerating Research Using Natural Language Processing and Machine Learning: Descriptive Performance Evaluation Study |
title | An Automated Literature Review Tool (LiteRev) for Streamlining and Accelerating Research Using Natural Language Processing and Machine Learning: Descriptive Performance Evaluation Study |
title_full | An Automated Literature Review Tool (LiteRev) for Streamlining and Accelerating Research Using Natural Language Processing and Machine Learning: Descriptive Performance Evaluation Study |
title_fullStr | An Automated Literature Review Tool (LiteRev) for Streamlining and Accelerating Research Using Natural Language Processing and Machine Learning: Descriptive Performance Evaluation Study |
title_full_unstemmed | An Automated Literature Review Tool (LiteRev) for Streamlining and Accelerating Research Using Natural Language Processing and Machine Learning: Descriptive Performance Evaluation Study |
title_short | An Automated Literature Review Tool (LiteRev) for Streamlining and Accelerating Research Using Natural Language Processing and Machine Learning: Descriptive Performance Evaluation Study |
title_sort | automated literature review tool (literev) for streamlining and accelerating research using natural language processing and machine learning: descriptive performance evaluation study |
topic | Original Paper |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10541641/ https://www.ncbi.nlm.nih.gov/pubmed/37713261 http://dx.doi.org/10.2196/39736 |
work_keys_str_mv | AT orelerol anautomatedliteraturereviewtoolliterevforstreamliningandacceleratingresearchusingnaturallanguageprocessingandmachinelearningdescriptiveperformanceevaluationstudy AT cigleneckiiza anautomatedliteraturereviewtoolliterevforstreamliningandacceleratingresearchusingnaturallanguageprocessingandmachinelearningdescriptiveperformanceevaluationstudy AT thiabaudamaury anautomatedliteraturereviewtoolliterevforstreamliningandacceleratingresearchusingnaturallanguageprocessingandmachinelearningdescriptiveperformanceevaluationstudy AT temerevalexander anautomatedliteraturereviewtoolliterevforstreamliningandacceleratingresearchusingnaturallanguageprocessingandmachinelearningdescriptiveperformanceevaluationstudy AT calmyalexandra anautomatedliteraturereviewtoolliterevforstreamliningandacceleratingresearchusingnaturallanguageprocessingandmachinelearningdescriptiveperformanceevaluationstudy AT keiserolivia anautomatedliteraturereviewtoolliterevforstreamliningandacceleratingresearchusingnaturallanguageprocessingandmachinelearningdescriptiveperformanceevaluationstudy AT merzoukiaziza anautomatedliteraturereviewtoolliterevforstreamliningandacceleratingresearchusingnaturallanguageprocessingandmachinelearningdescriptiveperformanceevaluationstudy AT orelerol automatedliteraturereviewtoolliterevforstreamliningandacceleratingresearchusingnaturallanguageprocessingandmachinelearningdescriptiveperformanceevaluationstudy AT cigleneckiiza automatedliteraturereviewtoolliterevforstreamliningandacceleratingresearchusingnaturallanguageprocessingandmachinelearningdescriptiveperformanceevaluationstudy AT thiabaudamaury automatedliteraturereviewtoolliterevforstreamliningandacceleratingresearchusingnaturallanguageprocessingandmachinelearningdescriptiveperformanceevaluationstudy AT temerevalexander automatedliteraturereviewtoolliterevforstreamliningandacceleratingresearchusingnaturallanguageprocessingandmachinelearningdescriptiveperformanceevaluationstudy AT calmyalexandra automatedliteraturereviewtoolliterevforstreamliningandacceleratingresearchusingnaturallanguageprocessingandmachinelearningdescriptiveperformanceevaluationstudy AT keiserolivia automatedliteraturereviewtoolliterevforstreamliningandacceleratingresearchusingnaturallanguageprocessingandmachinelearningdescriptiveperformanceevaluationstudy AT merzoukiaziza automatedliteraturereviewtoolliterevforstreamliningandacceleratingresearchusingnaturallanguageprocessingandmachinelearningdescriptiveperformanceevaluationstudy |
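The METHODS section of the abstract above describes a concrete text-processing pipeline: abstracts are represented as a TF-IDF matrix, embedded with pairwise controlled manifold approximation (PaCMAP), clustered with hierarchical density-based spatial clustering of applications with noise (HDBSCAN), and each resulting topic is summarized by its most important keywords. The following is a minimal sketch of that pipeline, not LiteRev’s actual implementation; the libraries (scikit-learn, pacmap, hdbscan), the parameters, and the keyword-ranking rule are assumptions.

```python
# Minimal sketch of the pipeline described in METHODS:
# TF-IDF representation -> PaCMAP dimensionality reduction -> HDBSCAN clustering,
# with each topic described by its highest-weighted TF-IDF terms.
# Library choices and parameters are assumptions, not LiteRev's configuration.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
import pacmap    # pairwise controlled manifold approximation (pip install pacmap)
import hdbscan   # hierarchical density-based clustering (pip install hdbscan)

def extract_topics(abstracts, n_keywords=10, min_cluster_size=5):
    # 1. Represent each abstract (or full text) as a row of a TF-IDF matrix.
    vectorizer = TfidfVectorizer(stop_words="english", max_features=5000)
    tfidf = vectorizer.fit_transform(abstracts)

    # 2. Reduce dimensionality with PaCMAP.
    embedding = pacmap.PaCMAP(n_components=2).fit_transform(tfidf.toarray())

    # 3. Cluster the low-dimensional embedding with HDBSCAN; label -1 marks noise papers.
    labels = hdbscan.HDBSCAN(min_cluster_size=min_cluster_size).fit_predict(embedding)

    # 4. Describe each topic by the terms with the highest mean TF-IDF weight.
    terms = np.array(vectorizer.get_feature_names_out())
    topics = {}
    for label in sorted(set(labels) - {-1}):
        mean_weights = np.asarray(tfidf[labels == label].mean(axis=0)).ravel()
        topics[label] = terms[mean_weights.argsort()[::-1][:n_keywords]].tolist()
    return labels, topics
```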
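The abstract also describes an iterative k-nearest neighbor screening step: starting from user-provided key papers, LiteRev suggests their nearest neighbors, the user tags the relevant ones, and new k-NN searches run until no additional paper is suggested. The loop below is a hedged reading of that description under the same TF-IDF (or embedded) representation; the function name, the `is_relevant` callback standing in for the user's tagging, and the choice of k are hypothetical.

```python
# Illustrative sketch of the iterative k-NN screening loop described in METHODS.
# X is the (n_papers, n_features) document representation; key_paper_idx are the
# indices of the user's key papers; is_relevant(i) stands in for the user's tag.
import numpy as np
from sklearn.neighbors import NearestNeighbors

def knn_screening(X, key_paper_idx, is_relevant, k=10):
    nn = NearestNeighbors(n_neighbors=k + 1).fit(X)  # +1 because a paper is its own neighbor
    relevant = set(key_paper_idx)
    screened = set(key_paper_idx)

    while True:
        # Suggest the k nearest neighbors of every currently relevant paper.
        _, neighbors = nn.kneighbors(X[sorted(relevant)])
        suggested = set(neighbors.ravel()) - screened
        if not suggested:
            break  # stop: no additional paper is suggested for screening
        for i in suggested:
            screened.add(i)
            if is_relevant(i):  # the user tags the relevant suggestions
                relevant.add(i)
    return relevant, screened
```

Each pass can only enlarge the screened set, so the loop always terminates once the neighborhood of the relevant papers has been exhausted.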
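The RESULTS report recall and work saved over sampling (WSS). The snippet below recomputes the reported figures from the counts given in the abstract; the WSS formula is the commonly used definition WSS = (TN + FN) / N - (1 - recall), and feeding it the 420 papers LiteRev did not suggest (613 - 193) together with the full-text recall is an assumption that happens to reproduce the reported 56%, not necessarily the authors' exact calculation.

```python
# Recomputing the RESULTS figures from the counts in the abstract (illustrative only).

def recall(n_found, n_relevant):
    """Share of the manually identified relevant papers that LiteRev also found."""
    return n_found / n_relevant

def wss(n_not_screened, n_total, recall_value):
    """Work saved over sampling: WSS = (TN + FN) / N - (1 - recall)."""
    return n_not_screened / n_total - (1 - recall_value)

print(f"abstract-screening recall: {recall(64, 87):.1%}")   # 73.6%
print(f"full-text recall:          {recall(42, 48):.1%}")   # 87.5%
# Assumption: the 613 - 193 = 420 non-suggested papers count as TN + FN.
print(f"work saved over sampling:  {wss(613 - 193, 613, recall(42, 48)):.1%}")  # ~56.0%
```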