An Automated Literature Review Tool (LiteRev) for Streamlining and Accelerating Research Using Natural Language Processing and Machine Learning: Descriptive Performance Evaluation Study


Bibliographic Details
Main Authors: Orel, Erol, Ciglenecki, Iza, Thiabaud, Amaury, Temerev, Alexander, Calmy, Alexandra, Keiser, Olivia, Merzouki, Aziza
Format: Online Article Text
Language: English
Published: JMIR Publications 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10541641/
https://www.ncbi.nlm.nih.gov/pubmed/37713261
http://dx.doi.org/10.2196/39736
_version_ 1785113939439255552
author Orel, Erol
Ciglenecki, Iza
Thiabaud, Amaury
Temerev, Alexander
Calmy, Alexandra
Keiser, Olivia
Merzouki, Aziza
author_facet Orel, Erol
Ciglenecki, Iza
Thiabaud, Amaury
Temerev, Alexander
Calmy, Alexandra
Keiser, Olivia
Merzouki, Aziza
author_sort Orel, Erol
collection PubMed
description BACKGROUND: Literature reviews (LRs) identify, evaluate, and synthesize papers relevant to a particular research question to advance understanding and support decision-making. However, LRs, especially traditional systematic reviews, are slow, resource-intensive, and quickly become outdated. OBJECTIVE: LiteRev is an advanced and enhanced version of an existing automation tool designed to assist researchers in conducting LRs using natural language processing and machine learning techniques. In this paper, we present a comprehensive description of LiteRev's capabilities and methodology, and an evaluation of its accuracy and efficiency compared with a manual LR, highlighting the benefits of using LiteRev. METHODS: Based on the user's query, LiteRev performs an automated search on a wide range of open-access databases and retrieves relevant metadata on the resulting papers, including abstracts or full texts when available. These abstracts (or full texts) are text processed and represented as a term frequency-inverse document frequency (TF-IDF) matrix. Using dimensionality reduction (pairwise controlled manifold approximation) and clustering (hierarchical density-based spatial clustering of applications with noise) techniques, the corpus is divided into topics, each described by a list of its most important keywords. The user can then select one or several topics of interest, enter additional keywords to refine the search, or provide key papers relevant to the research question. Based on these inputs, LiteRev performs a k-nearest neighbors (k-NN) search and suggests a list of potentially relevant papers. By tagging the relevant ones, the user triggers new k-NN searches until no additional paper is suggested for screening. To assess the performance of LiteRev, we ran it in parallel to a manual LR on the burden of and care for acute and early HIV infection in sub-Saharan Africa. We assessed the performance of LiteRev using true and false predictive values, recall, and work saved over sampling (WSS). RESULTS: LiteRev extracted, processed, and transformed the text of 631 unique papers from PubMed into a TF-IDF matrix. The topic modeling module identified 16 topics and highlighted 2 topics of interest to the research question. Based on 18 key papers, the k-NN module suggested 193 papers for screening out of 613 papers in total (31.5% of the whole corpus) and correctly identified 64 of the 87 relevant papers found by manual abstract screening (recall of 73.6%). Compared with manual full-text screening, LiteRev identified 42 of the 48 papers found manually (recall of 87.5%). This represents a total work saved over sampling of 56%. CONCLUSIONS: We presented the features and functionalities of LiteRev, an automation tool that uses natural language processing and machine learning methods to streamline and accelerate LRs and to support researchers in getting quick, in-depth overviews of any topic of interest.
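The TF-IDF representation at the core of LiteRev's text-processing step can be sketched in plain Python. This is a minimal illustration of the general technique, not LiteRev's actual implementation; the toy corpus, tokenization, and the unsmoothed idf variant are assumptions.

```python
import math
from collections import Counter

def tfidf_matrix(docs):
    """TF-IDF weights for each document, as a list of term -> weight dicts."""
    n = len(docs)
    tokenized = [doc.lower().split() for doc in docs]
    # Document frequency: in how many documents each term appears
    df = Counter(term for tokens in tokenized for term in set(tokens))
    matrix = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        # Weight = term frequency * inverse document frequency
        matrix.append({
            term: (c / total) * math.log(n / df[term])
            for term, c in counts.items()
        })
    return matrix

corpus = ["hiv care africa", "hiv burden", "malaria care"]  # toy corpus
weights = tfidf_matrix(corpus)
# "hiv" appears in 2 of 3 documents, so its idf (and weight) is lower
# than that of "burden", which appears in only 1 document
```

With this variant a term occurring in every document gets weight zero, which is exactly the behavior that pushes boilerplate vocabulary out of the keyword lists describing each topic.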
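The iterative k-NN screening loop described in the abstract (suggest neighbors of relevant papers, let the user tag them, repeat until nothing new is suggested) can be sketched as follows. Function names, the cosine-similarity choice, and the stopping rule are illustrative assumptions, not LiteRev's code.

```python
import math

def cosine(a, b):
    """Cosine similarity between two sparse vectors (term -> weight dicts)."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def knn_screening(vectors, seed_ids, is_relevant, k=5):
    """Iteratively suggest the k nearest unscreened neighbors of every
    relevant paper; papers tagged relevant seed the next round. Stops when
    a round yields no new suggestion. Returns (screened, relevant) ids."""
    relevant = set(seed_ids)
    screened = set(seed_ids)
    while True:
        suggestions = set()
        for rid in relevant:
            neighbors = sorted(
                (pid for pid in vectors if pid not in screened),
                key=lambda pid: cosine(vectors[rid], vectors[pid]),
                reverse=True,
            )[:k]
            suggestions.update(neighbors)
        if not suggestions:
            return screened, relevant
        for pid in suggestions:
            screened.add(pid)
            if is_relevant(pid):  # stands in for the user tagging the paper
                relevant.add(pid)

# Toy run: paper 0 is the key paper, papers about "hiv" count as relevant
vecs = {0: {"hiv": 1.0}, 1: {"hiv": 0.9, "care": 0.1}, 2: {"malaria": 1.0}}
screened, relevant = knn_screening(vecs, [0], lambda pid: "hiv" in vecs[pid], k=1)
```

The key property mirrored here is that screening stops by itself: once no relevant paper has an unscreened neighbor to offer, the loop terminates, which is why only a fraction of the corpus (31.5% in the evaluation) ever needs manual review.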
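The reported performance figures can be reproduced from the counts in the abstract. Recall is the share of manually found relevant papers that the tool also identified; work saved over sampling is commonly computed as (TN + FN)/N − (1 − recall), that is, the fraction of papers spared from screening minus the recall shortfall. The abstract does not spell out the exact formula the authors used, so this standard definition is an assumption that happens to match the reported 56%.

```python
def recall(found_by_tool, found_manually):
    """Fraction of manually identified relevant papers the tool also found."""
    return found_by_tool / found_manually

def wss(n_total, n_screened, rec):
    """Work saved over sampling: share of papers not screened,
    penalized by the recall shortfall (1 - recall)."""
    return (n_total - n_screened) / n_total - (1.0 - rec)

# Counts from the abstract
abstract_recall = recall(64, 87)   # vs manual abstract screening, ~0.736
fulltext_recall = recall(42, 48)   # vs manual full-text screening, 0.875
work_saved = wss(613, 193, fulltext_recall)  # ~0.56, i.e. 56%
```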
format Online
Article
Text
id pubmed-10541641
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher JMIR Publications
record_format MEDLINE/PubMed
spelling pubmed-10541641 2023-10-02 An Automated Literature Review Tool (LiteRev) for Streamlining and Accelerating Research Using Natural Language Processing and Machine Learning: Descriptive Performance Evaluation Study. J Med Internet Res, Original Paper. JMIR Publications 2023-09-15 /pmc/articles/PMC10541641/ /pubmed/37713261 http://dx.doi.org/10.2196/39736 Text en © Erol Orel, Iza Ciglenecki, Amaury Thiabaud, Alexander Temerev, Alexandra Calmy, Olivia Keiser, Aziza Merzouki. Originally published in the Journal of Medical Internet Research (https://www.jmir.org), 15.09.2023.
This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in the Journal of Medical Internet Research, is properly cited. The complete bibliographic information, a link to the original publication on https://www.jmir.org/, as well as this copyright and license information must be included.
spellingShingle Original Paper
Orel, Erol
Ciglenecki, Iza
Thiabaud, Amaury
Temerev, Alexander
Calmy, Alexandra
Keiser, Olivia
Merzouki, Aziza
An Automated Literature Review Tool (LiteRev) for Streamlining and Accelerating Research Using Natural Language Processing and Machine Learning: Descriptive Performance Evaluation Study
title An Automated Literature Review Tool (LiteRev) for Streamlining and Accelerating Research Using Natural Language Processing and Machine Learning: Descriptive Performance Evaluation Study
title_full An Automated Literature Review Tool (LiteRev) for Streamlining and Accelerating Research Using Natural Language Processing and Machine Learning: Descriptive Performance Evaluation Study
title_fullStr An Automated Literature Review Tool (LiteRev) for Streamlining and Accelerating Research Using Natural Language Processing and Machine Learning: Descriptive Performance Evaluation Study
title_full_unstemmed An Automated Literature Review Tool (LiteRev) for Streamlining and Accelerating Research Using Natural Language Processing and Machine Learning: Descriptive Performance Evaluation Study
title_short An Automated Literature Review Tool (LiteRev) for Streamlining and Accelerating Research Using Natural Language Processing and Machine Learning: Descriptive Performance Evaluation Study
title_sort automated literature review tool (literev) for streamlining and accelerating research using natural language processing and machine learning: descriptive performance evaluation study
topic Original Paper
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10541641/
https://www.ncbi.nlm.nih.gov/pubmed/37713261
http://dx.doi.org/10.2196/39736
work_keys_str_mv AT orelerol anautomatedliteraturereviewtoolliterevforstreamliningandacceleratingresearchusingnaturallanguageprocessingandmachinelearningdescriptiveperformanceevaluationstudy
AT cigleneckiiza anautomatedliteraturereviewtoolliterevforstreamliningandacceleratingresearchusingnaturallanguageprocessingandmachinelearningdescriptiveperformanceevaluationstudy
AT thiabaudamaury anautomatedliteraturereviewtoolliterevforstreamliningandacceleratingresearchusingnaturallanguageprocessingandmachinelearningdescriptiveperformanceevaluationstudy
AT temerevalexander anautomatedliteraturereviewtoolliterevforstreamliningandacceleratingresearchusingnaturallanguageprocessingandmachinelearningdescriptiveperformanceevaluationstudy
AT calmyalexandra anautomatedliteraturereviewtoolliterevforstreamliningandacceleratingresearchusingnaturallanguageprocessingandmachinelearningdescriptiveperformanceevaluationstudy
AT keiserolivia anautomatedliteraturereviewtoolliterevforstreamliningandacceleratingresearchusingnaturallanguageprocessingandmachinelearningdescriptiveperformanceevaluationstudy
AT merzoukiaziza anautomatedliteraturereviewtoolliterevforstreamliningandacceleratingresearchusingnaturallanguageprocessingandmachinelearningdescriptiveperformanceevaluationstudy
AT orelerol automatedliteraturereviewtoolliterevforstreamliningandacceleratingresearchusingnaturallanguageprocessingandmachinelearningdescriptiveperformanceevaluationstudy
AT cigleneckiiza automatedliteraturereviewtoolliterevforstreamliningandacceleratingresearchusingnaturallanguageprocessingandmachinelearningdescriptiveperformanceevaluationstudy
AT thiabaudamaury automatedliteraturereviewtoolliterevforstreamliningandacceleratingresearchusingnaturallanguageprocessingandmachinelearningdescriptiveperformanceevaluationstudy
AT temerevalexander automatedliteraturereviewtoolliterevforstreamliningandacceleratingresearchusingnaturallanguageprocessingandmachinelearningdescriptiveperformanceevaluationstudy
AT calmyalexandra automatedliteraturereviewtoolliterevforstreamliningandacceleratingresearchusingnaturallanguageprocessingandmachinelearningdescriptiveperformanceevaluationstudy
AT keiserolivia automatedliteraturereviewtoolliterevforstreamliningandacceleratingresearchusingnaturallanguageprocessingandmachinelearningdescriptiveperformanceevaluationstudy
AT merzoukiaziza automatedliteraturereviewtoolliterevforstreamliningandacceleratingresearchusingnaturallanguageprocessingandmachinelearningdescriptiveperformanceevaluationstudy