Assessment of Artificial Intelligence Performance on the Otolaryngology Residency In‐Service Exam
OBJECTIVES: This study seeks to determine the potential use and reliability of a large language learning model for answering questions in a sub‐specialized area of medicine, specifically practice exam questions in otolaryngology–head and neck surgery and assess its current efficacy for surgical trainees and learners.
Main Authors: | Mahajan, Arushi P.; Shabet, Christina L.; Smith, Joshua; Rudy, Shannon F.; Kupfer, Robbi A.; Bohm, Lauren A. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | John Wiley and Sons Inc., 2023 |
Subjects: | Original Research |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10687376/ https://www.ncbi.nlm.nih.gov/pubmed/38034065 http://dx.doi.org/10.1002/oto2.98 |
_version_ | 1785151964677406720 |
---|---|
author | Mahajan, Arushi P. Shabet, Christina L. Smith, Joshua Rudy, Shannon F. Kupfer, Robbi A. Bohm, Lauren A. |
author_facet | Mahajan, Arushi P. Shabet, Christina L. Smith, Joshua Rudy, Shannon F. Kupfer, Robbi A. Bohm, Lauren A. |
author_sort | Mahajan, Arushi P. |
collection | PubMed |
description | OBJECTIVES: This study seeks to determine the potential use and reliability of a large language learning model for answering questions in a sub‐specialized area of medicine, specifically practice exam questions in otolaryngology–head and neck surgery and assess its current efficacy for surgical trainees and learners. STUDY DESIGN AND SETTING: All available questions from a public, paid‐access question bank were manually input through ChatGPT. METHODS: Outputs from ChatGPT were compared against the benchmark of the answers and explanations from the question bank. Questions were assessed in 2 domains: accuracy and comprehensiveness of explanations. RESULTS: Overall, our study demonstrates a ChatGPT correct answer rate of 53% and a correct explanation rate of 54%. We find that with increasing difficulty of questions there is a decreasing rate of answer and explanation accuracy. CONCLUSION: Currently, artificial intelligence‐driven learning platforms are not robust enough to be reliable medical education resources to assist learners in sub‐specialty specific patient decision making scenarios. |
format | Online Article Text |
id | pubmed-10687376 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | John Wiley and Sons Inc. |
record_format | MEDLINE/PubMed |
spelling | pubmed-10687376 2023-11-30 Assessment of Artificial Intelligence Performance on the Otolaryngology Residency In‐Service Exam Mahajan, Arushi P. Shabet, Christina L. Smith, Joshua Rudy, Shannon F. Kupfer, Robbi A. Bohm, Lauren A. OTO Open Original Research OBJECTIVES: This study seeks to determine the potential use and reliability of a large language learning model for answering questions in a sub‐specialized area of medicine, specifically practice exam questions in otolaryngology–head and neck surgery and assess its current efficacy for surgical trainees and learners. STUDY DESIGN AND SETTING: All available questions from a public, paid‐access question bank were manually input through ChatGPT. METHODS: Outputs from ChatGPT were compared against the benchmark of the answers and explanations from the question bank. Questions were assessed in 2 domains: accuracy and comprehensiveness of explanations. RESULTS: Overall, our study demonstrates a ChatGPT correct answer rate of 53% and a correct explanation rate of 54%. We find that with increasing difficulty of questions there is a decreasing rate of answer and explanation accuracy. CONCLUSION: Currently, artificial intelligence‐driven learning platforms are not robust enough to be reliable medical education resources to assist learners in sub‐specialty specific patient decision making scenarios. John Wiley and Sons Inc. 2023-11-29 /pmc/articles/PMC10687376/ /pubmed/38034065 http://dx.doi.org/10.1002/oto2.98 Text en © 2023 The Authors. OTO Open published by Wiley Periodicals LLC on behalf of American Academy of Otolaryngology–Head and Neck Surgery Foundation. This is an open access article under the terms of the Creative Commons Attribution‐NonCommercial License (https://creativecommons.org/licenses/by-nc/4.0/), which permits use, distribution and reproduction in any medium, provided the original work is properly cited and is not used for commercial purposes. |
spellingShingle | Original Research Mahajan, Arushi P. Shabet, Christina L. Smith, Joshua Rudy, Shannon F. Kupfer, Robbi A. Bohm, Lauren A. Assessment of Artificial Intelligence Performance on the Otolaryngology Residency In‐Service Exam |
title | Assessment of Artificial Intelligence Performance on the Otolaryngology Residency In‐Service Exam |
title_full | Assessment of Artificial Intelligence Performance on the Otolaryngology Residency In‐Service Exam |
title_fullStr | Assessment of Artificial Intelligence Performance on the Otolaryngology Residency In‐Service Exam |
title_full_unstemmed | Assessment of Artificial Intelligence Performance on the Otolaryngology Residency In‐Service Exam |
title_short | Assessment of Artificial Intelligence Performance on the Otolaryngology Residency In‐Service Exam |
title_sort | assessment of artificial intelligence performance on the otolaryngology residency in‐service exam |
topic | Original Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10687376/ https://www.ncbi.nlm.nih.gov/pubmed/38034065 http://dx.doi.org/10.1002/oto2.98 |
work_keys_str_mv | AT mahajanaruship assessmentofartificialintelligenceperformanceontheotolaryngologyresidencyinserviceexam AT shabetchristinal assessmentofartificialintelligenceperformanceontheotolaryngologyresidencyinserviceexam AT smithjoshua assessmentofartificialintelligenceperformanceontheotolaryngologyresidencyinserviceexam AT rudyshannonf assessmentofartificialintelligenceperformanceontheotolaryngologyresidencyinserviceexam AT kupferrobbia assessmentofartificialintelligenceperformanceontheotolaryngologyresidencyinserviceexam AT bohmlaurena assessmentofartificialintelligenceperformanceontheotolaryngologyresidencyinserviceexam |
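The METHODS and RESULTS fields above describe grading ChatGPT responses against a question bank's answer key in two domains (answer accuracy and explanation accuracy) and observing lower accuracy at higher question difficulty. A minimal sketch of how such rates could be tabulated from graded responses; all field names and sample data here are hypothetical illustrations, not the authors' actual pipeline or data:

```python
# Hypothetical sketch (not the study's code): tabulating correct-answer and
# correct-explanation rates from graded responses, overall and by difficulty.
from collections import defaultdict

# (difficulty, answer_correct, explanation_correct) -- illustrative sample data
graded = [
    ("easy", True, True),
    ("easy", True, False),
    ("medium", True, True),
    ("medium", False, True),
    ("hard", False, False),
    ("hard", False, False),
]

def rates(records):
    """Return (correct-answer rate, correct-explanation rate) for a set of records."""
    n = len(records)
    answer_rate = sum(ans for _, ans, _ in records) / n
    explanation_rate = sum(expl for _, _, expl in records) / n
    return answer_rate, explanation_rate

# Overall rates across all graded questions
overall_answer, overall_explanation = rates(graded)

# Stratify by difficulty to see whether accuracy falls as difficulty rises
by_difficulty = defaultdict(list)
for record in graded:
    by_difficulty[record[0]].append(record)

for level in ("easy", "medium", "hard"):
    ans, expl = rates(by_difficulty[level])
    print(f"{level}: answer rate {ans:.2f}, explanation rate {expl:.2f}")
```

With real graded data, the per-difficulty rows would make the reported trend (decreasing accuracy with increasing difficulty) directly visible.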