Assessment of Artificial Intelligence Performance on the Otolaryngology Residency In‐Service Exam

OBJECTIVES: This study seeks to determine the potential use and reliability of a large language model for answering questions in a sub-specialized area of medicine, specifically practice exam questions in otolaryngology–head and neck surgery, and to assess its current efficacy for surgical trainees and learners. STUDY DESIGN AND SETTING: All available questions from a public, paid-access question bank were manually input into ChatGPT. METHODS: Outputs from ChatGPT were compared against the benchmark of the answers and explanations from the question bank. Questions were assessed in 2 domains: accuracy of answers and comprehensiveness of explanations. RESULTS: Overall, our study demonstrates a ChatGPT correct answer rate of 53% and a correct explanation rate of 54%. We find that as question difficulty increases, answer and explanation accuracy decrease. CONCLUSION: Currently, artificial intelligence-driven learning platforms are not robust enough to serve as reliable medical education resources for learners in sub-specialty-specific patient decision-making scenarios.
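The scoring the abstract describes (grading each ChatGPT response against the question bank's key, then aggregating answer and explanation accuracy overall and by question difficulty) can be sketched as follows. This is a minimal illustration, not the authors' code; the difficulty labels and sample records are hypothetical.

```python
# Hypothetical sketch of the grading described in the abstract: each response
# is marked correct/incorrect against the question bank's key, and accuracy
# is tallied per difficulty level. The sample data below are illustrative.
from collections import defaultdict

def accuracy_by_difficulty(graded):
    """graded: iterable of (difficulty, answer_correct, explanation_correct).

    Returns {difficulty: (answer_accuracy, explanation_accuracy)}.
    """
    totals = defaultdict(lambda: [0, 0, 0])  # difficulty -> [n, ans_ok, expl_ok]
    for difficulty, ans_ok, expl_ok in graded:
        t = totals[difficulty]
        t[0] += 1
        t[1] += int(ans_ok)
        t[2] += int(expl_ok)
    return {d: (a / n, e / n) for d, (n, a, e) in totals.items()}

# Illustrative records, not the study's data:
sample = [
    ("easy", True, True), ("easy", True, False),
    ("moderate", True, True), ("moderate", False, False),
    ("hard", False, False), ("hard", False, True),
]
rates = accuracy_by_difficulty(sample)
# e.g. rates["easy"] == (1.0, 0.5)
```

The study's finding that accuracy falls with difficulty corresponds to the per-difficulty rates declining monotonically in such a tally.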

Bibliographic Details
Main Authors: Mahajan, Arushi P., Shabet, Christina L., Smith, Joshua, Rudy, Shannon F., Kupfer, Robbi A., Bohm, Lauren A.
Format: Online Article Text
Language: English
Published: John Wiley and Sons Inc., 2023
Subjects: Original Research
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10687376/
https://www.ncbi.nlm.nih.gov/pubmed/38034065
http://dx.doi.org/10.1002/oto2.98
Journal: OTO Open (Original Research)
Published online: November 29, 2023
Rights: © 2023 The Authors. OTO Open published by Wiley Periodicals LLC on behalf of the American Academy of Otolaryngology–Head and Neck Surgery Foundation. This is an open access article under the terms of the Creative Commons Attribution-NonCommercial 4.0 License (https://creativecommons.org/licenses/by-nc/4.0/), which permits use, distribution, and reproduction in any medium, provided the original work is properly cited and is not used for commercial purposes.