Risk of bias assessment in preclinical literature using natural language processing

Bibliographic Details
Main Authors: Wang, Qianying, Liao, Jing, Lapata, Mirella, Macleod, Malcolm
Format: Online Article Text
Language: English
Published: John Wiley and Sons Inc. 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9298308/
https://www.ncbi.nlm.nih.gov/pubmed/34709718
http://dx.doi.org/10.1002/jrsm.1533
_version_ 1784750675998015488
author Wang, Qianying
Liao, Jing
Lapata, Mirella
Macleod, Malcolm
author_facet Wang, Qianying
Liao, Jing
Lapata, Mirella
Macleod, Malcolm
author_sort Wang, Qianying
collection PubMed
description We sought to apply natural language processing to the task of automatic risk of bias assessment in preclinical literature, which could speed the process of systematic review, provide information to guide research improvement activity, and support translation from preclinical to clinical research. We use 7840 full‐text publications describing animal experiments with yes/no annotations for five risk of bias items. We implement a series of models, including baselines (support vector machine, logistic regression, random forest), neural models (convolutional neural network, recurrent neural network with attention, hierarchical neural network) and models using BERT with two strategies (document chunk pooling and sentence extraction). We tune hyperparameters to obtain the highest F1 scores for each risk of bias item on the validation set and compare evaluation results on the test set to our previous regular expression approach. The F1 scores of the best models on the test set are 82.0% for random allocation, 81.6% for blinded assessment of outcome, 82.6% for conflict of interests, 91.4% for compliance with animal welfare regulations and 46.6% for reporting animals excluded from analysis. Our models significantly outperform regular expressions for four risk of bias items. For random allocation, blinded assessment of outcome, conflict of interests and animal exclusions, neural models achieve good performance; for animal welfare regulations, the BERT model with a sentence extraction strategy works better. Convolutional neural networks are the overall best models. The tool is publicly available, which may contribute to the future monitoring of risk of bias reporting for research improvement activities.
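For illustration only, and not the authors' released tool: a minimal sketch of one of the baseline classifiers named in the abstract (a support vector machine over bag-of-words features), with the hyperparameter chosen by F1 score on a validation split as the abstract describes. The function name, data splits, TF-IDF settings and parameter grid are all assumptions introduced here for the example.

# Hypothetical sketch of a baseline risk-of-bias classifier (not the authors' code):
# TF-IDF features + linear SVM for one yes/no item, keeping the regularisation
# strength C that gives the highest F1 on a held-out validation split.
from sklearn.base import clone
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import f1_score
from sklearn.pipeline import Pipeline
from sklearn.svm import LinearSVC

def train_rob_baseline(train_texts, train_labels, val_texts, val_labels):
    """Return the best pipeline and its validation F1 for a single risk-of-bias item."""
    base = Pipeline([
        ("tfidf", TfidfVectorizer(ngram_range=(1, 2), min_df=2)),  # unigrams + bigrams
        ("svm", LinearSVC()),
    ])
    best_model, best_f1 = None, -1.0
    for c in (0.01, 0.1, 1.0, 10.0):  # illustrative grid, not the paper's settings
        model = clone(base).set_params(svm__C=c)
        model.fit(train_texts, train_labels)  # labels: 1 = item reported, 0 = not reported
        f1 = f1_score(val_labels, model.predict(val_texts))
        if f1 > best_f1:
            best_model, best_f1 = model, f1
    return best_model, best_f1

The neural models, the BERT chunk-pooling and sentence-extraction strategies, and the test-set comparison against regular expressions described in the abstract are beyond the scope of this sketch.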
format Online
Article
Text
id pubmed-9298308
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher John Wiley and Sons Inc.
record_format MEDLINE/PubMed
spelling pubmed-9298308 2022-07-21 Risk of bias assessment in preclinical literature using natural language processing Wang, Qianying Liao, Jing Lapata, Mirella Macleod, Malcolm Res Synth Methods Software Focus We sought to apply natural language processing to the task of automatic risk of bias assessment in preclinical literature, which could speed the process of systematic review, provide information to guide research improvement activity, and support translation from preclinical to clinical research. We use 7840 full‐text publications describing animal experiments with yes/no annotations for five risk of bias items. We implement a series of models, including baselines (support vector machine, logistic regression, random forest), neural models (convolutional neural network, recurrent neural network with attention, hierarchical neural network) and models using BERT with two strategies (document chunk pooling and sentence extraction). We tune hyperparameters to obtain the highest F1 scores for each risk of bias item on the validation set and compare evaluation results on the test set to our previous regular expression approach. The F1 scores of the best models on the test set are 82.0% for random allocation, 81.6% for blinded assessment of outcome, 82.6% for conflict of interests, 91.4% for compliance with animal welfare regulations and 46.6% for reporting animals excluded from analysis. Our models significantly outperform regular expressions for four risk of bias items. For random allocation, blinded assessment of outcome, conflict of interests and animal exclusions, neural models achieve good performance; for animal welfare regulations, the BERT model with a sentence extraction strategy works better. Convolutional neural networks are the overall best models. The tool is publicly available, which may contribute to the future monitoring of risk of bias reporting for research improvement activities. John Wiley and Sons Inc. 2021-11-05 2022-05 /pmc/articles/PMC9298308/ /pubmed/34709718 http://dx.doi.org/10.1002/jrsm.1533 Text en © 2021 The Authors. Research Synthesis Methods published by John Wiley & Sons Ltd. This is an open access article under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits use, distribution and reproduction in any medium, provided the original work is properly cited.
spellingShingle Software Focus
Wang, Qianying
Liao, Jing
Lapata, Mirella
Macleod, Malcolm
Risk of bias assessment in preclinical literature using natural language processing
title Risk of bias assessment in preclinical literature using natural language processing
title_full Risk of bias assessment in preclinical literature using natural language processing
title_fullStr Risk of bias assessment in preclinical literature using natural language processing
title_full_unstemmed Risk of bias assessment in preclinical literature using natural language processing
title_short Risk of bias assessment in preclinical literature using natural language processing
title_sort risk of bias assessment in preclinical literature using natural language processing
topic Software Focus
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9298308/
https://www.ncbi.nlm.nih.gov/pubmed/34709718
http://dx.doi.org/10.1002/jrsm.1533
work_keys_str_mv AT wangqianying riskofbiasassessmentinpreclinicalliteratureusingnaturallanguageprocessing
AT liaojing riskofbiasassessmentinpreclinicalliteratureusingnaturallanguageprocessing
AT lapatamirella riskofbiasassessmentinpreclinicalliteratureusingnaturallanguageprocessing
AT macleodmalcolm riskofbiasassessmentinpreclinicalliteratureusingnaturallanguageprocessing