How suitable are clinical vignettes for the evaluation of symptom checker apps? A test theoretical perspective

OBJECTIVE: To evaluate the ability of case vignettes to assess the performance of symptom checker applications and to suggest refinements to the methodology used in case vignette-based audit studies. METHODS: We re-analyzed the publicly available data of two prominent case vignette-based symptom che...

Bibliographic Details
Main Authors:	Kopka, Marvin, Feufel, Markus A, Berner, Eta S, Schmieding, Malte L
Format:	Online Article Text
Language:	English
Published:	SAGE Publications 2023
Subjects:	Original Research
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10444026/ https://www.ncbi.nlm.nih.gov/pubmed/37614591 http://dx.doi.org/10.1177/20552076231194929

_version_	1785093961462841344
author	Kopka, Marvin Feufel, Markus A Berner, Eta S Schmieding, Malte L
author_facet	Kopka, Marvin Feufel, Markus A Berner, Eta S Schmieding, Malte L
author_sort	Kopka, Marvin
collection	PubMed
description	OBJECTIVE: To evaluate the ability of case vignettes to assess the performance of symptom checker applications and to suggest refinements to the methodology used in case vignette-based audit studies. METHODS: We re-analyzed the publicly available data of two prominent case vignette-based symptom checker audit studies by calculating common metrics of test theory. Furthermore, we developed a new metric, the Capability Comparison Score (CCS), which compares symptom checker capability while controlling for the difficulty of the set of cases each symptom checker evaluated. We then scrutinized whether applying test theory and the CCS altered the performance ranking of the investigated symptom checkers. RESULTS: In both studies, most symptom checkers changed their rank order when adjusting the triage capability for item difficulty (ID) with the CCS. The previously reported triage accuracies commonly overestimated the capability of symptom checkers because they did not account for the fact that symptom checkers tend to selectively appraise easier cases (i.e., with high ID values). Also, many case vignettes in both studies showed insufficient (very low and even negative) values of item-total correlation (ITC), suggesting that individual items or the composition of item sets are of low quality. CONCLUSIONS: A test–theoretic perspective helps identify previously undetected threats to the validity of case vignette-based symptom checker assessments and provides guidance and specific metrics to improve the quality of case vignettes, in particular by controlling for the difficulty of the vignettes an app was (not) able to evaluate correctly. Such measures might prove more meaningful than accuracy alone for the competitive assessment of symptom checkers. Our approach helps elaborate and standardize the methodology used for appraising symptom checker capability, which, ultimately, may yield more reliable results.
format	Online Article Text
id	pubmed-10444026
institution	National Center for Biotechnology Information
language	English
publishDate	2023
publisher	SAGE Publications
record_format	MEDLINE/PubMed
spelling	pubmed-104440262023-08-23 How suitable are clinical vignettes for the evaluation of symptom checker apps? A test theoretical perspective Kopka, Marvin Feufel, Markus A Berner, Eta S Schmieding, Malte L Digit Health Original Research OBJECTIVE: To evaluate the ability of case vignettes to assess the performance of symptom checker applications and to suggest refinements to the methodology used in case vignette-based audit studies. METHODS: We re-analyzed the publicly available data of two prominent case vignette-based symptom checker audit studies by calculating common metrics of test theory. Furthermore, we developed a new metric, the Capability Comparison Score (CCS), which compares symptom checker capability while controlling for the difficulty of the set of cases each symptom checker evaluated. We then scrutinized whether applying test theory and the CCS altered the performance ranking of the investigated symptom checkers. RESULTS: In both studies, most symptom checkers changed their rank order when adjusting the triage capability for item difficulty (ID) with the CCS. The previously reported triage accuracies commonly overestimated the capability of symptom checkers because they did not account for the fact that symptom checkers tend to selectively appraise easier cases (i.e., with high ID values). Also, many case vignettes in both studies showed insufficient (very low and even negative) values of item-total correlation (ITC), suggesting that individual items or the composition of item sets are of low quality. CONCLUSIONS: A test–theoretic perspective helps identify previously undetected threats to the validity of case vignette-based symptom checker assessments and provides guidance and specific metrics to improve the quality of case vignettes, in particular by controlling for the difficulty of the vignettes an app was (not) able to evaluate correctly. 
Such measures might prove more meaningful than accuracy alone for the competitive assessment of symptom checkers. Our approach helps elaborate and standardize the methodology used for appraising symptom checker capability, which, ultimately, may yield more reliable results. SAGE Publications 2023-08-21 /pmc/articles/PMC10444026/ /pubmed/37614591 http://dx.doi.org/10.1177/20552076231194929 Text en © The Author(s) 2023 https://creativecommons.org/licenses/by-nc/4.0/ This article is distributed under the terms of the Creative Commons Attribution-NonCommercial 4.0 License (https://creativecommons.org/licenses/by-nc/4.0/) which permits non-commercial use, reproduction and distribution of the work without further permission provided the original work is attributed as specified on the SAGE and Open Access page (https://us.sagepub.com/en-us/nam/open-access-at-sage).
spellingShingle	Original Research Kopka, Marvin Feufel, Markus A Berner, Eta S Schmieding, Malte L How suitable are clinical vignettes for the evaluation of symptom checker apps? A test theoretical perspective
title	How suitable are clinical vignettes for the evaluation of symptom checker apps? A test theoretical perspective
title_full	How suitable are clinical vignettes for the evaluation of symptom checker apps? A test theoretical perspective
title_fullStr	How suitable are clinical vignettes for the evaluation of symptom checker apps? A test theoretical perspective
title_full_unstemmed	How suitable are clinical vignettes for the evaluation of symptom checker apps? A test theoretical perspective
title_short	How suitable are clinical vignettes for the evaluation of symptom checker apps? A test theoretical perspective
title_sort	how suitable are clinical vignettes for the evaluation of symptom checker apps? a test theoretical perspective
topic	Original Research
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10444026/ https://www.ncbi.nlm.nih.gov/pubmed/37614591 http://dx.doi.org/10.1177/20552076231194929
work_keys_str_mv	AT kopkamarvin howsuitableareclinicalvignettesfortheevaluationofsymptomcheckerappsatesttheoreticalperspective AT feufelmarkusa howsuitableareclinicalvignettesfortheevaluationofsymptomcheckerappsatesttheoreticalperspective AT berneretas howsuitableareclinicalvignettesfortheevaluationofsymptomcheckerappsatesttheoreticalperspective AT schmiedingmaltel howsuitableareclinicalvignettesfortheevaluationofsymptomcheckerappsatesttheoreticalperspective
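
The abstract above refers to two standard test-theory metrics, item difficulty (ID) and item-total correlation (ITC). The sketch below shows one common way such values could be computed from a table of triage results; it is an illustration only, not the authors' code. The data layout (one row per symptom checker, one column per vignette, 1 = correct, 0 = incorrect, None = vignette not evaluated), the function names, and the use of a "corrected" ITC that excludes the item from the total are all assumptions. The Capability Comparison Score is not sketched because the record does not define it.

```python
from statistics import mean


def item_difficulty(results):
    """Share of symptom checkers that solved each vignette.

    Higher values mean easier vignettes, matching the abstract's
    "easier cases (i.e., with high ID values)". Entries of None
    (vignette not evaluated by that app) are skipped.
    """
    n_items = len(results[0])
    difficulties = []
    for j in range(n_items):
        scores = [row[j] for row in results if row[j] is not None]
        difficulties.append(mean(scores) if scores else None)
    return difficulties


def pearson(x, y):
    """Pearson correlation; returns None when either variable is constant."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return None
    return cov / (vx * vy) ** 0.5


def item_total_correlation(results, j):
    """Corrected ITC for vignette j (assumed variant).

    Correlates the score on vignette j with the mean score on all
    other vignettes the same app evaluated.
    """
    item_scores, rest_scores = [], []
    for row in results:
        if row[j] is None:
            continue
        rest = [v for k, v in enumerate(row) if k != j and v is not None]
        if not rest:
            continue
        item_scores.append(row[j])
        rest_scores.append(mean(rest))
    if len(item_scores) < 2:
        return None
    return pearson(item_scores, rest_scores)


# Hypothetical toy data: 4 apps x 4 vignettes (not taken from the studies).
results = [
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, None],
    [0, 1, 0, 0],
]

if __name__ == "__main__":
    print(item_difficulty(results))
    print([item_total_correlation(results, j) for j in range(4)])
```

A vignette that no app solves (or that every app solves) has no variance, so its ITC is undefined and the sketch returns None for it. Very low or negative ITC values are the pattern the abstract flags as a sign of low-quality items or item sets.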
