Diagnostic and Management Applications of ChatGPT in Structured Otolaryngology Clinical Scenarios

Bibliographic Details
Main Authors: Qu, Roy W., Qureshi, Uneeb, Petersen, Garrett, Lee, Steve C.
Format: Online Article Text
Language: English
Published: John Wiley and Sons Inc., 2023
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10442607/
https://www.ncbi.nlm.nih.gov/pubmed/37614494
http://dx.doi.org/10.1002/oto2.67
Description
Summary:
OBJECTIVE: To evaluate the clinical applications and limitations of chat generative pretrained transformer (ChatGPT) in otolaryngology.
STUDY DESIGN: Cross‐sectional survey.
SETTING: Tertiary academic center.
METHODS: ChatGPT 4.0 was queried for diagnoses and management plans for 20 physician‐written clinical vignettes in otolaryngology. Attending physicians were then asked to rate the difficulty of the clinical vignettes and their agreement with the differential diagnoses and management plans in ChatGPT's responses on a 5‐point Likert scale. Summary statistics were calculated, and univariate ordinal regression was performed between vignette difficulty and the quality of the diagnoses and management plans.
RESULTS: Eleven attending physicians completed the survey (61% response rate). Overall, vignettes were rated as very easy to neutral in difficulty (range of median scores: 1.00‐4.00; overall median: 2.00). There was high agreement with the differential diagnoses provided by ChatGPT (range of median scores: 3.00‐5.00; overall median: 5.00) and with its treatment plans (range of median scores: 3.00‐5.00; overall median: 5.00). There was no association between vignette difficulty and agreement with either differential diagnosis or treatment. Lower diagnosis scores had greater odds of being accompanied by lower treatment scores.
CONCLUSION: Generative artificial intelligence models like ChatGPT are being rapidly adopted in medicine. Performance on curated, easy‐to‐moderate difficulty otolaryngology scenarios indicates high agreement with physicians on diagnosis and management. However, decreased quality in diagnosis is associated with decreased quality in management. Further research is necessary on ChatGPT's ability to handle unstructured clinical information.