A Deep Learning Approach to Refine the Identification of High-Quality Clinical Research Articles From the Biomedical Literature: Protocol for Algorithm Development and Validation
BACKGROUND: A barrier to practicing evidence-based medicine is the rapidly increasing body of biomedical literature. Use of method terms to limit the search can help reduce the burden of screening articles for clinical relevance; however, such terms are limited by their partial dependence on indexin...
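Main authors: | Abdelkader, Wael; Navarro, Tamara; Parrish, Rick; Cotoi, Chris; Germini, Federico; Linkins, Lori-Ann; Iorio, Alfonso; Haynes, R Brian; Ananiadou, Sophia; Chu, Lingyang; Lokker, Cynthia |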
---|---|
Format: | Online Article Text |
Language: | English |
Published: | JMIR Publications 2021 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8669577/ https://www.ncbi.nlm.nih.gov/pubmed/34847061 http://dx.doi.org/10.2196/29398 |
_version_ | 1784614807042785280 |
---|---|
author | Abdelkader, Wael Navarro, Tamara Parrish, Rick Cotoi, Chris Germini, Federico Linkins, Lori-Ann Iorio, Alfonso Haynes, R Brian Ananiadou, Sophia Chu, Lingyang Lokker, Cynthia |
author_facet | Abdelkader, Wael Navarro, Tamara Parrish, Rick Cotoi, Chris Germini, Federico Linkins, Lori-Ann Iorio, Alfonso Haynes, R Brian Ananiadou, Sophia Chu, Lingyang Lokker, Cynthia |
author_sort | Abdelkader, Wael |
collection | PubMed |
description | BACKGROUND: A barrier to practicing evidence-based medicine is the rapidly increasing body of biomedical literature. Use of method terms to limit the search can help reduce the burden of screening articles for clinical relevance; however, such terms are limited by their partial dependence on indexing terms and usually produce low precision, especially when high sensitivity is required. Machine learning has been applied to the identification of high-quality literature with the potential to achieve high precision without sacrificing sensitivity. The use of artificial intelligence has shown promise to improve the efficiency of identifying sound evidence. OBJECTIVE: The primary objective of this research is to derive and validate deep learning machine models using iterations of Bidirectional Encoder Representations from Transformers (BERT) to retrieve high-quality, high-relevance evidence for clinical consideration from the biomedical literature. METHODS: Using the HuggingFace Transformers library, we will experiment with variations of BERT models, including BERT, BioBERT, BlueBERT, and PubMedBERT, to determine which have the best performance in article identification based on quality criteria. Our experiments will utilize a large data set of over 150,000 PubMed citations from 2012 to 2020 that have been manually labeled based on their methodological rigor for clinical use. We will evaluate and report on the performance of the classifiers in categorizing articles based on their likelihood of meeting quality criteria. We will report fine-tuning hyperparameters for each model, as well as their performance metrics, including recall (sensitivity), specificity, precision, accuracy, F-score, the number of articles that need to be read before finding one that is positive (meets criteria), and classification probability scores. RESULTS: Initial model development is underway, with further development planned for early 2022. 
Performance testing is expected to start in February 2022. Results will be published in 2022. CONCLUSIONS: The experiments will aim to improve the precision of retrieving high-quality articles by applying a machine learning classifier to PubMed searching. INTERNATIONAL REGISTERED REPORT IDENTIFIER (IRRID): DERR1-10.2196/29398 |
format | Online Article Text |
id | pubmed-8669577 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | JMIR Publications |
record_format | MEDLINE/PubMed |
spelling | pubmed-86695772022-01-10 A Deep Learning Approach to Refine the Identification of High-Quality Clinical Research Articles From the Biomedical Literature: Protocol for Algorithm Development and Validation Abdelkader, Wael Navarro, Tamara Parrish, Rick Cotoi, Chris Germini, Federico Linkins, Lori-Ann Iorio, Alfonso Haynes, R Brian Ananiadou, Sophia Chu, Lingyang Lokker, Cynthia JMIR Res Protoc Protocol BACKGROUND: A barrier to practicing evidence-based medicine is the rapidly increasing body of biomedical literature. Use of method terms to limit the search can help reduce the burden of screening articles for clinical relevance; however, such terms are limited by their partial dependence on indexing terms and usually produce low precision, especially when high sensitivity is required. Machine learning has been applied to the identification of high-quality literature with the potential to achieve high precision without sacrificing sensitivity. The use of artificial intelligence has shown promise to improve the efficiency of identifying sound evidence. OBJECTIVE: The primary objective of this research is to derive and validate deep learning machine models using iterations of Bidirectional Encoder Representations from Transformers (BERT) to retrieve high-quality, high-relevance evidence for clinical consideration from the biomedical literature. METHODS: Using the HuggingFace Transformers library, we will experiment with variations of BERT models, including BERT, BioBERT, BlueBERT, and PubMedBERT, to determine which have the best performance in article identification based on quality criteria. Our experiments will utilize a large data set of over 150,000 PubMed citations from 2012 to 2020 that have been manually labeled based on their methodological rigor for clinical use. We will evaluate and report on the performance of the classifiers in categorizing articles based on their likelihood of meeting quality criteria. 
We will report fine-tuning hyperparameters for each model, as well as their performance metrics, including recall (sensitivity), specificity, precision, accuracy, F-score, the number of articles that need to be read before finding one that is positive (meets criteria), and classification probability scores. RESULTS: Initial model development is underway, with further development planned for early 2022. Performance testing is expected to start in February 2022. Results will be published in 2022. CONCLUSIONS: The experiments will aim to improve the precision of retrieving high-quality articles by applying a machine learning classifier to PubMed searching. INTERNATIONAL REGISTERED REPORT IDENTIFIER (IRRID): DERR1-10.2196/29398 JMIR Publications 2021-11-29 /pmc/articles/PMC8669577/ /pubmed/34847061 http://dx.doi.org/10.2196/29398 Text en ©Wael Abdelkader, Tamara Navarro, Rick Parrish, Chris Cotoi, Federico Germini, Lori-Ann Linkins, Alfonso Iorio, R Brian Haynes, Sophia Ananiadou, Lingyang Chu, Cynthia Lokker. Originally published in JMIR Research Protocols (https://www.researchprotocols.org), 29.11.2021. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Research Protocols, is properly cited. The complete bibliographic information, a link to the original publication on https://www.researchprotocols.org, as well as this copyright and license information must be included. |
spellingShingle | Protocol Abdelkader, Wael Navarro, Tamara Parrish, Rick Cotoi, Chris Germini, Federico Linkins, Lori-Ann Iorio, Alfonso Haynes, R Brian Ananiadou, Sophia Chu, Lingyang Lokker, Cynthia A Deep Learning Approach to Refine the Identification of High-Quality Clinical Research Articles From the Biomedical Literature: Protocol for Algorithm Development and Validation |
title | A Deep Learning Approach to Refine the Identification of High-Quality Clinical Research Articles From the Biomedical Literature: Protocol for Algorithm Development and Validation |
title_sort | deep learning approach to refine the identification of high-quality clinical research articles from the biomedical literature: protocol for algorithm development and validation |
topic | Protocol |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8669577/ https://www.ncbi.nlm.nih.gov/pubmed/34847061 http://dx.doi.org/10.2196/29398 |
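
The abstract in this record lists the metrics the authors plan to report: recall (sensitivity), specificity, precision, accuracy, F-score, and the number of articles that need to be read before finding one that meets criteria. As a rough illustration only, the sketch below computes these from binary labels and classification probability scores. It is not the authors' code; the function name `evaluate`, the `Metrics` container, and the default threshold of 0.5 are all assumptions made for this example, and the number needed to read is taken as the reciprocal of precision.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Metrics:
    """Hypothetical container for the metrics named in the abstract."""
    recall: float                 # sensitivity: TP / (TP + FN)
    specificity: float            # TN / (TN + FP)
    precision: float              # TP / (TP + FP)
    accuracy: float               # (TP + TN) / all
    f1: float                     # harmonic mean of precision and recall
    number_needed_to_read: float  # 1 / precision


def _ratio(num: float, den: float) -> float:
    """Return num / den, or NaN when the denominator is zero."""
    return num / den if den else float("nan")


def evaluate(labels: Sequence[int],
             scores: Sequence[float],
             threshold: float = 0.5) -> Metrics:
    """Compute metrics from 0/1 labels and probability scores.

    An article is predicted positive (meets quality criteria) when its
    score is at or above `threshold`. The 0.5 default is an assumption.
    """
    if len(labels) != len(scores):
        raise ValueError("labels and scores must have the same length")

    preds = [1 if s >= threshold else 0 for s in scores]
    pairs = list(zip(labels, preds))

    # Confusion-matrix counts.
    tp = sum(1 for y, p in pairs if y == 1 and p == 1)
    fp = sum(1 for y, p in pairs if y == 0 and p == 1)
    fn = sum(1 for y, p in pairs if y == 1 and p == 0)
    tn = sum(1 for y, p in pairs if y == 0 and p == 0)

    return Metrics(
        recall=_ratio(tp, tp + fn),
        specificity=_ratio(tn, tn + fp),
        precision=_ratio(tp, tp + fp),
        accuracy=_ratio(tp + tn, len(pairs)),
        f1=_ratio(2 * tp, 2 * tp + fp + fn),
        number_needed_to_read=_ratio(tp + fp, tp),
    )


if __name__ == "__main__":
    # Toy data invented for illustration; not results from the study.
    toy_labels = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
    toy_scores = [0.9, 0.8, 0.3, 0.7, 0.6, 0.2, 0.1, 0.1, 0.05, 0.4]
    print(evaluate(toy_labels, toy_scores))
```

Lowering `threshold` would typically raise recall at the cost of precision, which in turn raises the number needed to read; this is the trade-off the abstract describes between sensitivity and precision.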