Putting ChatGPT’s Medical Advice to the (Turing) Test: Survey Study

Bibliographic Details
Main Authors: Nov, Oded, Singh, Nina, Mann, Devin
Format: Online Article Text
Language: English
Published: JMIR Publications 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10366957/
https://www.ncbi.nlm.nih.gov/pubmed/37428540
http://dx.doi.org/10.2196/46939
_version_ 1785077283685400576
author Nov, Oded
Singh, Nina
Mann, Devin
author_facet Nov, Oded
Singh, Nina
Mann, Devin
author_sort Nov, Oded
collection PubMed
description BACKGROUND: Chatbots are being piloted to draft responses to patient questions, but patients’ ability to distinguish between provider and chatbot responses and patients’ trust in chatbots’ functions are not well established. OBJECTIVE: This study aimed to assess the feasibility of using ChatGPT (Chat Generative Pre-trained Transformer) or a similar artificial intelligence–based chatbot for patient-provider communication. METHODS: A survey study was conducted in January 2023. Ten representative, nonadministrative patient-provider interactions were extracted from the electronic health record. Patients’ questions were entered into ChatGPT with a request for the chatbot to respond using approximately the same word count as the human provider’s response. In the survey, each patient question was followed by a provider- or ChatGPT-generated response. Participants were informed that 5 responses were provider generated and 5 were chatbot generated. Participants were asked—and incentivized financially—to correctly identify the response source. Participants were also asked about their trust in chatbots’ functions in patient-provider communication, using a Likert scale from 1 to 5. RESULTS: A US-representative sample of 430 study participants aged 18 and older was recruited on Prolific, a crowdsourcing platform for academic studies. In all, 426 participants filled out the full survey. After removing participants who spent less than 3 minutes on the survey, 392 respondents remained. Overall, 53.3% (209/392) of the respondents analyzed were women, and the average age was 47.1 (range 18-91) years. The correct classification of responses ranged from 49% (192/392) to 85.7% (336/392) across the different questions. On average, chatbot responses were identified correctly in 65.5% (1284/1960) of the cases, and human provider responses were identified correctly in 65.1% (1276/1960) of the cases. On average, responses regarding patients’ trust in chatbots’ functions were weakly positive (mean Likert score 3.4 out of 5), with lower trust as the health-related complexity of the task in the questions increased. CONCLUSIONS: ChatGPT responses to patient questions were weakly distinguishable from provider responses. Laypeople appear to trust the use of chatbots to answer lower-risk health questions. It is important to continue studying patient-chatbot interaction as chatbots move from administrative to more clinical roles in health care.
format Online
Article
Text
id pubmed-10366957
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher JMIR Publications
record_format MEDLINE/PubMed
spelling pubmed-10366957 2023-07-26 Putting ChatGPT’s Medical Advice to the (Turing) Test: Survey Study Nov, Oded Singh, Nina Mann, Devin JMIR Med Educ Original Paper BACKGROUND: Chatbots are being piloted to draft responses to patient questions, but patients’ ability to distinguish between provider and chatbot responses and patients’ trust in chatbots’ functions are not well established. OBJECTIVE: This study aimed to assess the feasibility of using ChatGPT (Chat Generative Pre-trained Transformer) or a similar artificial intelligence–based chatbot for patient-provider communication. METHODS: A survey study was conducted in January 2023. Ten representative, nonadministrative patient-provider interactions were extracted from the electronic health record. Patients’ questions were entered into ChatGPT with a request for the chatbot to respond using approximately the same word count as the human provider’s response. In the survey, each patient question was followed by a provider- or ChatGPT-generated response. Participants were informed that 5 responses were provider generated and 5 were chatbot generated. Participants were asked—and incentivized financially—to correctly identify the response source. Participants were also asked about their trust in chatbots’ functions in patient-provider communication, using a Likert scale from 1 to 5. RESULTS: A US-representative sample of 430 study participants aged 18 and older was recruited on Prolific, a crowdsourcing platform for academic studies. In all, 426 participants filled out the full survey. After removing participants who spent less than 3 minutes on the survey, 392 respondents remained. Overall, 53.3% (209/392) of the respondents analyzed were women, and the average age was 47.1 (range 18-91) years. The correct classification of responses ranged from 49% (192/392) to 85.7% (336/392) across the different questions. On average, chatbot responses were identified correctly in 65.5% (1284/1960) of the cases, and human provider responses were identified correctly in 65.1% (1276/1960) of the cases. On average, responses regarding patients’ trust in chatbots’ functions were weakly positive (mean Likert score 3.4 out of 5), with lower trust as the health-related complexity of the task in the questions increased. CONCLUSIONS: ChatGPT responses to patient questions were weakly distinguishable from provider responses. Laypeople appear to trust the use of chatbots to answer lower-risk health questions. It is important to continue studying patient-chatbot interaction as chatbots move from administrative to more clinical roles in health care. JMIR Publications 2023-07-10 /pmc/articles/PMC10366957/ /pubmed/37428540 http://dx.doi.org/10.2196/46939 Text en ©Oded Nov, Nina Singh, Devin Mann. Originally published in JMIR Medical Education (https://mededu.jmir.org), 10.07.2023. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Medical Education, is properly cited. The complete bibliographic information, a link to the original publication on https://mededu.jmir.org/, as well as this copyright and license information must be included.
spellingShingle Original Paper
Nov, Oded
Singh, Nina
Mann, Devin
Putting ChatGPT’s Medical Advice to the (Turing) Test: Survey Study
title Putting ChatGPT’s Medical Advice to the (Turing) Test: Survey Study
title_full Putting ChatGPT’s Medical Advice to the (Turing) Test: Survey Study
title_fullStr Putting ChatGPT’s Medical Advice to the (Turing) Test: Survey Study
title_full_unstemmed Putting ChatGPT’s Medical Advice to the (Turing) Test: Survey Study
title_short Putting ChatGPT’s Medical Advice to the (Turing) Test: Survey Study
title_sort putting chatgpt’s medical advice to the (turing) test: survey study
topic Original Paper
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10366957/
https://www.ncbi.nlm.nih.gov/pubmed/37428540
http://dx.doi.org/10.2196/46939
work_keys_str_mv AT novoded puttingchatgptsmedicaladvicetotheturingtestsurveystudy
AT singhnina puttingchatgptsmedicaladvicetotheturingtestsurveystudy
AT manndevin puttingchatgptsmedicaladvicetotheturingtestsurveystudy