Efficacy of AI Chats to Determine an Emergency: A Comparison Between OpenAI’s ChatGPT, Google Bard, and Microsoft Bing AI Chat

Background The escalating overload and saturation of emergency services, primarily caused by non-urgent cases overwhelming the system, have spurred a critical necessity for innovative solutions that can effectively differentiate genuine emergencies from situations that could be managed through alter...


Bibliographic Details
Main Authors:	Zúñiga Salazar, Gabriel; Zúñiga, Diego; Vindel, Carlos L; Yoong, Ana M; Hincapie, Sofia; Zúñiga, Ana B; Zúñiga, Paula; Salazar, Erin; Zúñiga, Byron
Format:	Online Article Text
Language:	English
Published:	Cureus 2023
Subjects:	Emergency Medicine
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10506659/ https://www.ncbi.nlm.nih.gov/pubmed/37727841 http://dx.doi.org/10.7759/cureus.45473

_version_	1785107151184723968
author	Zúñiga Salazar, Gabriel; Zúñiga, Diego; Vindel, Carlos L; Yoong, Ana M; Hincapie, Sofia; Zúñiga, Ana B; Zúñiga, Paula; Salazar, Erin; Zúñiga, Byron
author_sort	Zúñiga Salazar, Gabriel
collection	PubMed
description	Background: The escalating overload and saturation of emergency services, primarily caused by non-urgent cases overwhelming the system, have spurred a critical necessity for innovative solutions that can effectively differentiate genuine emergencies from situations that could be managed through alternative means, such as using AI chatbots. This study aims to evaluate and compare the accuracy in differentiating between a medical emergency and a non-emergency of three of the most popular AI chatbots at the moment. Methods: In this study, patient questions from the online forum r/AskDocs on Reddit were collected to determine whether their clinical cases were emergencies. A total of 176 questions were reviewed by the authors, with 75 deemed emergencies and 101 non-emergencies. These questions were then posed to AI chatbots, including ChatGPT, Google Bard, and Microsoft Bing AI, with their responses evaluated against each other and the authors’ responses. A criteria-based system categorized the AI chatbot answers as “yes,” “no,” or “cannot determine.” The performance of each AI chatbot was compared in both emergency and non-emergency cases, and statistical analysis was conducted to assess the significance of differences in their performance. Results: In general, AI chatbots considered around 12-15% more cases to be an emergency than reviewers, while they considered a very low number of cases as non-emergency compared to reviewers (around 35% fewer cases). Google Bard detected the most true emergency cases (87%) and true non-emergency cases (36%). However, no real difference in performance between the three AI chatbots was found in detecting true emergencies (p-value = 0.35) and non-emergency cases (p-value = 0.16). Conclusions: These AI systems require further refinement to identify emergency situations accurately, but they could potentially be an innovative tool for emergency care and improving patient outcomes. The integration of AI chatbots like ChatGPT, Google Bard, and Microsoft Bing Chat offers a promising avenue to mitigate ED strain and enhance emergency management.
format	Online Article Text
id	pubmed-10506659
institution	National Center for Biotechnology Information
language	English
publishDate	2023
publisher	Cureus
record_format	MEDLINE/PubMed
spelling	pubmed-10506659 2023-09-19 Cureus 2023-09-18 /pmc/articles/PMC10506659/ /pubmed/37727841 http://dx.doi.org/10.7759/cureus.45473 Text en Copyright © 2023, Zúñiga Salazar et al. https://creativecommons.org/licenses/by/3.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title	Efficacy of AI Chats to Determine an Emergency: A Comparison Between OpenAI’s ChatGPT, Google Bard, and Microsoft Bing AI Chat
topic	Emergency Medicine
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10506659/ https://www.ncbi.nlm.nih.gov/pubmed/37727841 http://dx.doi.org/10.7759/cureus.45473
