
A Deep Learning Approach to Refine the Identification of High-Quality Clinical Research Articles From the Biomedical Literature: Protocol for Algorithm Development and Validation

BACKGROUND: A barrier to practicing evidence-based medicine is the rapidly increasing body of biomedical literature. Use of method terms to limit the search can help reduce the burden of screening articles for clinical relevance; however, such terms are limited by their partial dependence on indexin...


Bibliographic Details
Main Authors: Abdelkader, Wael, Navarro, Tamara, Parrish, Rick, Cotoi, Chris, Germini, Federico, Linkins, Lori-Ann, Iorio, Alfonso, Haynes, R Brian, Ananiadou, Sophia, Chu, Lingyang, Lokker, Cynthia
Format: Online Article Text
Language: English
Published: JMIR Publications 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8669577/
https://www.ncbi.nlm.nih.gov/pubmed/34847061
http://dx.doi.org/10.2196/29398
_version_ 1784614807042785280
author Abdelkader, Wael
Navarro, Tamara
Parrish, Rick
Cotoi, Chris
Germini, Federico
Linkins, Lori-Ann
Iorio, Alfonso
Haynes, R Brian
Ananiadou, Sophia
Chu, Lingyang
Lokker, Cynthia
author_sort Abdelkader, Wael
collection PubMed
description BACKGROUND: A barrier to practicing evidence-based medicine is the rapidly increasing body of biomedical literature. Use of method terms to limit the search can help reduce the burden of screening articles for clinical relevance; however, such terms are limited by their partial dependence on indexing terms and usually produce low precision, especially when high sensitivity is required. Machine learning has been applied to the identification of high-quality literature with the potential to achieve high precision without sacrificing sensitivity. The use of artificial intelligence has shown promise to improve the efficiency of identifying sound evidence. OBJECTIVE: The primary objective of this research is to derive and validate deep learning machine models using iterations of Bidirectional Encoder Representations from Transformers (BERT) to retrieve high-quality, high-relevance evidence for clinical consideration from the biomedical literature. METHODS: Using the HuggingFace Transformers library, we will experiment with variations of BERT models, including BERT, BioBERT, BlueBERT, and PubMedBERT, to determine which have the best performance in article identification based on quality criteria. Our experiments will utilize a large data set of over 150,000 PubMed citations from 2012 to 2020 that have been manually labeled based on their methodological rigor for clinical use. We will evaluate and report on the performance of the classifiers in categorizing articles based on their likelihood of meeting quality criteria. We will report fine-tuning hyperparameters for each model, as well as their performance metrics, including recall (sensitivity), specificity, precision, accuracy, F-score, the number of articles that need to be read before finding one that is positive (meets criteria), and classification probability scores. RESULTS: Initial model development is underway, with further development planned for early 2022. 
Performance testing is expected to start in February 2022. Results will be published in 2022. CONCLUSIONS: The experiments will aim to improve the precision of retrieving high-quality articles by applying a machine learning classifier to PubMed searching. INTERNATIONAL REGISTERED REPORT IDENTIFIER (IRRID): DERR1-10.2196/29398
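The performance metrics listed in the METHODS section can all be derived from a binary confusion matrix. The following is a minimal sketch, not the authors' evaluation code. It assumes that "F-score" means the balanced F1 score (the abstract does not state a beta value) and that the number of articles that need to be read is computed as the reciprocal of precision, which is a common definition but is not confirmed by the record. The class name, function name, and example counts are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts for a binary classifier (positive = article meets quality criteria)."""
    tp: int  # articles that meet criteria and were flagged
    fp: int  # articles that do not meet criteria but were flagged
    tn: int  # articles that do not meet criteria and were not flagged
    fn: int  # articles that meet criteria but were missed


def _ratio(numerator: float, denominator: float) -> float:
    """Return numerator / denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def classification_metrics(c: ConfusionCounts) -> dict:
    """Compute the metrics named in the abstract from confusion-matrix counts."""
    recall = _ratio(c.tp, c.tp + c.fn)        # sensitivity
    specificity = _ratio(c.tn, c.tn + c.fp)
    precision = _ratio(c.tp, c.tp + c.fp)
    accuracy = _ratio(c.tp + c.tn, c.tp + c.fp + c.tn + c.fn)
    # Assumption: F-score is the balanced F1 (harmonic mean of precision and recall).
    f1 = _ratio(2 * precision * recall, precision + recall)
    # Assumption: number needed to read is taken as 1 / precision.
    number_needed_to_read = _ratio(1.0, precision)
    return {
        "recall": recall,
        "specificity": specificity,
        "precision": precision,
        "accuracy": accuracy,
        "f1": f1,
        "number_needed_to_read": number_needed_to_read,
    }


# Hypothetical counts for illustration only; these are not results from the study.
example = classification_metrics(ConfusionCounts(tp=80, fp=120, tn=780, fn=20))
```

With these illustrative counts, a precision of 0.4 corresponds to reading 2.5 flagged articles on average to find one that meets the criteria, which shows why raising precision at a fixed recall reduces the screening burden.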
format Online
Article
Text
id pubmed-8669577
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher JMIR Publications
record_format MEDLINE/PubMed
spelling pubmed-86695772022-01-10 A Deep Learning Approach to Refine the Identification of High-Quality Clinical Research Articles From the Biomedical Literature: Protocol for Algorithm Development and Validation Abdelkader, Wael Navarro, Tamara Parrish, Rick Cotoi, Chris Germini, Federico Linkins, Lori-Ann Iorio, Alfonso Haynes, R Brian Ananiadou, Sophia Chu, Lingyang Lokker, Cynthia JMIR Res Protoc Protocol BACKGROUND: A barrier to practicing evidence-based medicine is the rapidly increasing body of biomedical literature. Use of method terms to limit the search can help reduce the burden of screening articles for clinical relevance; however, such terms are limited by their partial dependence on indexing terms and usually produce low precision, especially when high sensitivity is required. Machine learning has been applied to the identification of high-quality literature with the potential to achieve high precision without sacrificing sensitivity. The use of artificial intelligence has shown promise to improve the efficiency of identifying sound evidence. OBJECTIVE: The primary objective of this research is to derive and validate deep learning machine models using iterations of Bidirectional Encoder Representations from Transformers (BERT) to retrieve high-quality, high-relevance evidence for clinical consideration from the biomedical literature. METHODS: Using the HuggingFace Transformers library, we will experiment with variations of BERT models, including BERT, BioBERT, BlueBERT, and PubMedBERT, to determine which have the best performance in article identification based on quality criteria. Our experiments will utilize a large data set of over 150,000 PubMed citations from 2012 to 2020 that have been manually labeled based on their methodological rigor for clinical use. We will evaluate and report on the performance of the classifiers in categorizing articles based on their likelihood of meeting quality criteria. 
We will report fine-tuning hyperparameters for each model, as well as their performance metrics, including recall (sensitivity), specificity, precision, accuracy, F-score, the number of articles that need to be read before finding one that is positive (meets criteria), and classification probability scores. RESULTS: Initial model development is underway, with further development planned for early 2022. Performance testing is expected to start in February 2022. Results will be published in 2022. CONCLUSIONS: The experiments will aim to improve the precision of retrieving high-quality articles by applying a machine learning classifier to PubMed searching. INTERNATIONAL REGISTERED REPORT IDENTIFIER (IRRID): DERR1-10.2196/29398 JMIR Publications 2021-11-29 /pmc/articles/PMC8669577/ /pubmed/34847061 http://dx.doi.org/10.2196/29398 Text en ©Wael Abdelkader, Tamara Navarro, Rick Parrish, Chris Cotoi, Federico Germini, Lori-Ann Linkins, Alfonso Iorio, R Brian Haynes, Sophia Ananiadou, Lingyang Chu, Cynthia Lokker. Originally published in JMIR Research Protocols (https://www.researchprotocols.org), 29.11.2021. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Research Protocols, is properly cited. The complete bibliographic information, a link to the original publication on https://www.researchprotocols.org, as well as this copyright and license information must be included.
spellingShingle Protocol
Abdelkader, Wael
Navarro, Tamara
Parrish, Rick
Cotoi, Chris
Germini, Federico
Linkins, Lori-Ann
Iorio, Alfonso
Haynes, R Brian
Ananiadou, Sophia
Chu, Lingyang
Lokker, Cynthia
A Deep Learning Approach to Refine the Identification of High-Quality Clinical Research Articles From the Biomedical Literature: Protocol for Algorithm Development and Validation
title A Deep Learning Approach to Refine the Identification of High-Quality Clinical Research Articles From the Biomedical Literature: Protocol for Algorithm Development and Validation
title_sort deep learning approach to refine the identification of high-quality clinical research articles from the biomedical literature: protocol for algorithm development and validation
topic Protocol
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8669577/
https://www.ncbi.nlm.nih.gov/pubmed/34847061
http://dx.doi.org/10.2196/29398