Automating Quality Assessment of Medical Evidence in Systematic Reviews: Model Development and Validation Study
Main Authors: Šuster, Simon; Baldwin, Timothy; Lau, Jey Han; Jimeno Yepes, Antonio; Martinez Iraola, David; Otmakhova, Yulia; Verspoor, Karin
Format: Online Article Text
Language: English
Published: JMIR Publications, 2023
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10131699/ https://www.ncbi.nlm.nih.gov/pubmed/36722350 http://dx.doi.org/10.2196/35568
_version_ | 1785031232564756480 |
author | Šuster, Simon Baldwin, Timothy Lau, Jey Han Jimeno Yepes, Antonio Martinez Iraola, David Otmakhova, Yulia Verspoor, Karin |
author_facet | Šuster, Simon Baldwin, Timothy Lau, Jey Han Jimeno Yepes, Antonio Martinez Iraola, David Otmakhova, Yulia Verspoor, Karin |
author_sort | Šuster, Simon |
collection | PubMed |
description | BACKGROUND: Assessment of the quality of medical evidence available on the web is a critical step in the preparation of systematic reviews. Existing tools that automate parts of this task validate the quality of individual studies but not of entire bodies of evidence and focus on a restricted set of quality criteria. OBJECTIVE: We proposed a quality assessment task that provides an overall quality rating for each body of evidence (BoE), as well as finer-grained justification for different quality criteria according to the Grading of Recommendations Assessment, Development, and Evaluation (GRADE) framework. For this purpose, we constructed a new data set and developed a machine learning baseline system (EvidenceGRADEr). METHODS: We algorithmically extracted quality-related data from all summaries of findings found in the Cochrane Database of Systematic Reviews. Each BoE was defined by a set of population, intervention, comparison, and outcome criteria and assigned a quality grade (high, moderate, low, or very low) together with the quality criteria (justification) that influenced that decision. Statistical data, metadata about the review, and parts of the review text were extracted as support for grading each BoE. After pruning the resulting data set with various quality checks, we used it to train several neural-model variants. The predictions were compared against the labels originally assigned by the authors of the systematic reviews. RESULTS: Our quality assessment data set, Cochrane Database of Systematic Reviews Quality of Evidence, contains 13,440 instances, or BoEs labeled for quality, originating from 2252 systematic reviews published on the internet from 2002 to 2020. On the basis of 10-fold cross-validation, the best neural binary classifiers for quality criteria detected risk of bias at 0.78 F1 (precision=0.68; recall=0.92) and imprecision at 0.75 F1 (precision=0.66; recall=0.86), while performance on the inconsistency, indirectness, and publication bias criteria was lower (F1 in the range of 0.3-0.4). Predicting the overall quality grade as 1 of the 4 levels resulted in 0.5 F1. When casting the task as a binary problem by merging the GRADE classes (high+moderate vs low+very low-quality evidence), we attained 0.74 F1. We also found that results varied depending on the supporting information provided as input to the models. CONCLUSIONS: Different factors affect the quality of evidence in the context of systematic reviews of medical evidence. Some of these (risk of bias and imprecision) can be assessed automatically with reasonable accuracy. Other quality dimensions, such as indirectness, inconsistency, and publication bias, prove more challenging for machine learning, largely because they are much rarer. This technology could substantially reduce reviewer workload in the future and expedite quality assessment as part of evidence synthesis. |
format | Online Article Text |
id | pubmed-10131699 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | JMIR Publications |
record_format | MEDLINE/PubMed |
spelling | pubmed-10131699 2023-04-27. Automating Quality Assessment of Medical Evidence in Systematic Reviews: Model Development and Validation Study. Šuster, Simon; Baldwin, Timothy; Lau, Jey Han; Jimeno Yepes, Antonio; Martinez Iraola, David; Otmakhova, Yulia; Verspoor, Karin. J Med Internet Res, Original Paper. JMIR Publications, 2023-03-13. /pmc/articles/PMC10131699/ /pubmed/36722350 http://dx.doi.org/10.2196/35568 Text en ©Simon Šuster, Timothy Baldwin, Jey Han Lau, Antonio Jimeno Yepes, David Martinez Iraola, Yulia Otmakhova, Karin Verspoor. Originally published in the Journal of Medical Internet Research (https://www.jmir.org), 13.03.2023. This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in the Journal of Medical Internet Research, is properly cited. The complete bibliographic information, a link to the original publication on https://www.jmir.org/, as well as this copyright and license information must be included. |
spellingShingle | Original Paper Šuster, Simon Baldwin, Timothy Lau, Jey Han Jimeno Yepes, Antonio Martinez Iraola, David Otmakhova, Yulia Verspoor, Karin Automating Quality Assessment of Medical Evidence in Systematic Reviews: Model Development and Validation Study |
title | Automating Quality Assessment of Medical Evidence in Systematic Reviews: Model Development and Validation Study |
title_full | Automating Quality Assessment of Medical Evidence in Systematic Reviews: Model Development and Validation Study |
title_fullStr | Automating Quality Assessment of Medical Evidence in Systematic Reviews: Model Development and Validation Study |
title_full_unstemmed | Automating Quality Assessment of Medical Evidence in Systematic Reviews: Model Development and Validation Study |
title_short | Automating Quality Assessment of Medical Evidence in Systematic Reviews: Model Development and Validation Study |
title_sort | automating quality assessment of medical evidence in systematic reviews: model development and validation study |
topic | Original Paper |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10131699/ https://www.ncbi.nlm.nih.gov/pubmed/36722350 http://dx.doi.org/10.2196/35568 |
work_keys_str_mv | AT sustersimon automatingqualityassessmentofmedicalevidenceinsystematicreviewsmodeldevelopmentandvalidationstudy AT baldwintimothy automatingqualityassessmentofmedicalevidenceinsystematicreviewsmodeldevelopmentandvalidationstudy AT laujeyhan automatingqualityassessmentofmedicalevidenceinsystematicreviewsmodeldevelopmentandvalidationstudy AT jimenoyepesantonio automatingqualityassessmentofmedicalevidenceinsystematicreviewsmodeldevelopmentandvalidationstudy AT martineziraoladavid automatingqualityassessmentofmedicalevidenceinsystematicreviewsmodeldevelopmentandvalidationstudy AT otmakhovayulia automatingqualityassessmentofmedicalevidenceinsystematicreviewsmodeldevelopmentandvalidationstudy AT verspoorkarin automatingqualityassessmentofmedicalevidenceinsystematicreviewsmodeldevelopmentandvalidationstudy |
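As a minimal illustration of the evaluation setup described in the abstract above: the overall GRADE prediction task can be binarized by merging the high+moderate classes against the low+very low classes and scored with 10-fold cross-validated F1. The sketch below assumes a simple bag-of-words classifier and illustrative variable names; it is not the authors' EvidenceGRADEr system.

```python
# Hypothetical sketch only: a bag-of-words baseline for the binarized
# GRADE task (high+moderate vs low+very low), evaluated with 10-fold
# cross-validated F1 as in the abstract. Model and feature choices are
# assumptions, not the EvidenceGRADEr system from the paper.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline

# Merge the four GRADE quality levels into two classes.
GRADE_TO_BINARY = {"high": 1, "moderate": 1, "low": 0, "very low": 0}

def cv_f1(texts, grades, n_splits=10):
    """Mean F1 over stratified 10-fold cross-validation.

    texts  -- supporting text extracted for each body of evidence (BoE)
    grades -- the original four-level GRADE labels, one per BoE
    """
    labels = [GRADE_TO_BINARY[g] for g in grades]
    model = make_pipeline(
        TfidfVectorizer(ngram_range=(1, 2), min_df=2),
        LogisticRegression(max_iter=1000),
    )
    folds = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=0)
    return cross_val_score(model, texts, labels, cv=folds, scoring="f1").mean()
```

Each of the five quality criteria (risk of bias, imprecision, inconsistency, indirectness, publication bias) could be handled the same way as an independent binary classifier, swapping scoring="f1" for "precision" or "recall" to obtain the per-criterion precision/recall figures of the kind reported in the abstract.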