
Assessing the Performance of a New Artificial Intelligence–Driven Diagnostic Support Tool Using Medical Board Exam Simulations: Clinical Vignette Study

Bibliographic Details
Main Authors: Ben-Shabat, Niv, Sloma, Ariel, Weizman, Tomer, Kiderman, David, Amital, Howard
Format: Online Article Text
Language: English
Published: JMIR Publications 2021
Subjects: Original Paper
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8672291/
https://www.ncbi.nlm.nih.gov/pubmed/34672262
http://dx.doi.org/10.2196/32507
author Ben-Shabat, Niv
Sloma, Ariel
Weizman, Tomer
Kiderman, David
Amital, Howard
collection PubMed
description BACKGROUND: Diagnostic decision support systems (DDSS) are computer programs designed to improve health care by supporting clinicians in the process of diagnostic decision-making. Previous studies on DDSS have demonstrated their ability to enhance clinicians’ diagnostic skills, prevent diagnostic errors, and reduce hospitalization costs. Despite these potential benefits, their use in clinical practice remains limited, emphasizing the need for new and improved products. OBJECTIVE: The aim of this study was to conduct a preliminary analysis of the diagnostic performance of “Kahun,” a new artificial intelligence–driven diagnostic tool. METHODS: Diagnostic performance was evaluated based on the program’s ability to “solve” clinical cases from United States Medical Licensing Examination Step 2 Clinical Skills board exam simulations drawn from the case banks of 3 leading preparation companies. Each case included 3 expected differential diagnoses. The cases were entered into the Kahun platform by 3 blinded junior physicians. For each case, the presence and rank of the correct diagnoses within the generated differential diagnosis list were recorded. Diagnostic performance was measured in two ways: first, as diagnostic sensitivity, and second, as case-specific success rates representing diagnostic comprehensiveness. RESULTS: The study included 91 clinical cases with 78 different chief complaints and a mean of 38 (SD 8) findings per case. The total number of expected diagnoses was 272, of which 174 were distinct (some appeared more than once). Of the 272 expected diagnoses, 231 (87.5%; 95% CI 76-99) were suggested within the top 20 listed diagnoses, 209 (76.8%; 95% CI 66-87) within the top 10, and 168 (61.8%; 95% CI 52-71) within the top 5. The median rank of correct diagnoses was 3 (IQR 2-6). Of the 91 cases, all 3 expected diagnoses were suggested within the top 20 listed diagnoses in 62 (68%; 95% CI 59-78), within the top 10 in 44 (48%; 95% CI 38-59), and within the top 5 in 24 (26%; 95% CI 17-35). At least 2 of the 3 expected diagnoses were suggested within the top 20 listed diagnoses in 87 cases (96%; 95% CI 91-100), within the top 10 in 78 (86%; 95% CI 79-93), and within the top 5 in 61 (67%; 95% CI 57-77). CONCLUSIONS: The diagnostic support tool evaluated in this study demonstrated good diagnostic accuracy and comprehensiveness and was able to manage a wide range of clinical findings.
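To illustrate the two performance measures described in the abstract (top-k diagnostic sensitivity across all expected diagnoses, and case-level success rates requiring a minimum number of a case's 3 expected diagnoses within the top k suggestions), the following is a minimal Python sketch. The function names, data layout, and toy data are assumptions for illustration only and are not taken from the study or the Kahun platform.

# Minimal sketch (illustrative; names and data layout are assumptions,
# not taken from the study) of the two measures in the abstract:
# top-k diagnostic sensitivity and case-level success rates.

def topk_sensitivity(cases, k):
    """Share of all expected diagnoses appearing within the top k suggestions.

    `cases` is a list of (expected_diagnoses, ranked_suggestions) pairs,
    where ranked_suggestions is ordered best-first.
    """
    hits = 0
    total = 0
    for expected, suggestions in cases:
        top_k = set(suggestions[:k])
        hits += sum(dx in top_k for dx in expected)
        total += len(expected)
    return hits / total

def case_success_rate(cases, k, min_hits):
    """Share of cases with at least `min_hits` expected diagnoses in the top k."""
    successes = sum(
        sum(dx in set(suggestions[:k]) for dx in expected) >= min_hits
        for expected, suggestions in cases
    )
    return successes / len(cases)

# Toy example (invented data, not from the study): one case with its
# 3 expected diagnoses and the tool's ranked differential list.
cases = [
    (["pneumonia", "pulmonary embolism", "heart failure"],
     ["heart failure", "pneumonia", "COPD exacerbation", "pulmonary embolism", "asthma"]),
]
print(topk_sensitivity(cases, k=5))               # 1.0 -> all 3 expected diagnoses within the top 5
print(case_success_rate(cases, k=3, min_hits=2))  # 1.0 -> at least 2 of 3 within the top 3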
format Online
Article
Text
id pubmed-8672291
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher JMIR Publications
record_format MEDLINE/PubMed
spelling pubmed-8672291 2022-01-10 JMIR Med Inform. Original Paper. JMIR Publications 2021-11-30 /pmc/articles/PMC8672291/ /pubmed/34672262 http://dx.doi.org/10.2196/32507 Text en ©Niv Ben-Shabat, Ariel Sloma, Tomer Weizman, David Kiderman, Howard Amital. Originally published in JMIR Medical Informatics (https://medinform.jmir.org), 30.11.2021. This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Medical Informatics, is properly cited. The complete bibliographic information, a link to the original publication on https://medinform.jmir.org/, as well as this copyright and license information must be included.
title Assessing the Performance of a New Artificial Intelligence–Driven Diagnostic Support Tool Using Medical Board Exam Simulations: Clinical Vignette Study
topic Original Paper
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8672291/
https://www.ncbi.nlm.nih.gov/pubmed/34672262
http://dx.doi.org/10.2196/32507