High Rates of Fabricated and Inaccurate References in ChatGPT-Generated Medical Content
Background The availability of large language models such as Chat Generative Pre-trained Transformer (ChatGPT, OpenAI) has enabled individuals from diverse backgrounds to access medical information. However, concerns exist about the accuracy of ChatGPT responses and the references used to generate medical content. Methods This observational study investigated the authenticity and accuracy of references in medical articles generated by ChatGPT. ChatGPT-3.5 generated 30 short medical papers, each with at least three references, based on standardized prompts encompassing various topics and therapeutic areas. Reference authenticity and accuracy were verified by searching Medline, Google Scholar, and the Directory of Open Access Journals. The authenticity and accuracy of individual ChatGPT-generated reference elements were also determined. Results Overall, 115 references were generated by ChatGPT, with a mean of 3.8±1.1 per paper. Among these references, 47% were fabricated, 46% were authentic but inaccurate, and only 7% were authentic and accurate. The likelihood of fabricated references significantly differed based on prompt variations; yet the frequency of authentic and accurate references remained low in all cases. Among the seven components evaluated for each reference, an incorrect PMID number was most common, listed in 93% of papers. Incorrect volume (64%), page numbers (64%), and year of publication (60%) were the next most frequent errors. The mean number of inaccurate components was 4.3±2.8 out of seven per reference. Conclusions The findings of this study emphasize the need for caution when seeking medical information on ChatGPT since most of the references provided were found to be fabricated or inaccurate. Individuals are advised to verify medical information from reliable sources and avoid relying solely on artificial intelligence-generated content.
Main Authors: | Bhattacharyya, Mehul; Miller, Valerie M; Bhattacharyya, Debjani; Miller, Larry E |
Format: | Online Article Text |
Language: | English |
Published: | Cureus, 2023 |
Subjects: | Medical Education |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10277170/ https://www.ncbi.nlm.nih.gov/pubmed/37337480 http://dx.doi.org/10.7759/cureus.39238 |
_version_ | 1785060233188474880 |
author | Bhattacharyya, Mehul Miller, Valerie M Bhattacharyya, Debjani Miller, Larry E |
author_facet | Bhattacharyya, Mehul Miller, Valerie M Bhattacharyya, Debjani Miller, Larry E |
author_sort | Bhattacharyya, Mehul |
collection | PubMed |
description | Background The availability of large language models such as Chat Generative Pre-trained Transformer (ChatGPT, OpenAI) has enabled individuals from diverse backgrounds to access medical information. However, concerns exist about the accuracy of ChatGPT responses and the references used to generate medical content. Methods This observational study investigated the authenticity and accuracy of references in medical articles generated by ChatGPT. ChatGPT-3.5 generated 30 short medical papers, each with at least three references, based on standardized prompts encompassing various topics and therapeutic areas. Reference authenticity and accuracy were verified by searching Medline, Google Scholar, and the Directory of Open Access Journals. The authenticity and accuracy of individual ChatGPT-generated reference elements were also determined. Results Overall, 115 references were generated by ChatGPT, with a mean of 3.8±1.1 per paper. Among these references, 47% were fabricated, 46% were authentic but inaccurate, and only 7% were authentic and accurate. The likelihood of fabricated references significantly differed based on prompt variations; yet the frequency of authentic and accurate references remained low in all cases. Among the seven components evaluated for each reference, an incorrect PMID number was most common, listed in 93% of papers. Incorrect volume (64%), page numbers (64%), and year of publication (60%) were the next most frequent errors. The mean number of inaccurate components was 4.3±2.8 out of seven per reference. Conclusions The findings of this study emphasize the need for caution when seeking medical information on ChatGPT since most of the references provided were found to be fabricated or inaccurate. Individuals are advised to verify medical information from reliable sources and avoid relying solely on artificial intelligence-generated content. |
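The abstract above describes checking each generated reference against a verified database record, component by component, across seven elements (the study names PMID, volume, page numbers, and year of publication among them). A minimal sketch of that per-component comparison is below; the field names and sample data are hypothetical illustrations, since the study itself verified references manually against Medline, Google Scholar, and the Directory of Open Access Journals.

```python
# Sketch of a per-component reference check: compare a generated citation
# against a verified record and count the components that disagree.
# The seven component names here are assumed labels for illustration.
COMPONENTS = ["authors", "title", "journal", "year", "volume", "pages", "pmid"]

def count_inaccurate(generated: dict, verified: dict) -> int:
    """Return how many of the seven reference components disagree."""
    return sum(
        1 for field in COMPONENTS
        if generated.get(field) != verified.get(field)
    )

# Hypothetical example: the generated reference names the right paper but
# carries a wrong page range and a fabricated PMID.
generated = {
    "authors": "Smith J, Lee K", "title": "Example trial", "journal": "J Ex Med",
    "year": 2020, "volume": 12, "pages": "101-110", "pmid": "99999999",
}
verified = dict(generated, pages="210-219", pmid="31234567")

print(count_inaccurate(generated, verified))  # → 2
```

A real pipeline would first decide whether the cited paper exists at all (the study's authenticity check) before scoring component accuracy, since a fabricated reference has no verified record to compare against.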
format | Online Article Text |
id | pubmed-10277170 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Cureus |
record_format | MEDLINE/PubMed |
spelling | pubmed-102771702023-06-19 High Rates of Fabricated and Inaccurate References in ChatGPT-Generated Medical Content Bhattacharyya, Mehul Miller, Valerie M Bhattacharyya, Debjani Miller, Larry E Cureus Medical Education Background The availability of large language models such as Chat Generative Pre-trained Transformer (ChatGPT, OpenAI) has enabled individuals from diverse backgrounds to access medical information. However, concerns exist about the accuracy of ChatGPT responses and the references used to generate medical content. Methods This observational study investigated the authenticity and accuracy of references in medical articles generated by ChatGPT. ChatGPT-3.5 generated 30 short medical papers, each with at least three references, based on standardized prompts encompassing various topics and therapeutic areas. Reference authenticity and accuracy were verified by searching Medline, Google Scholar, and the Directory of Open Access Journals. The authenticity and accuracy of individual ChatGPT-generated reference elements were also determined. Results Overall, 115 references were generated by ChatGPT, with a mean of 3.8±1.1 per paper. Among these references, 47% were fabricated, 46% were authentic but inaccurate, and only 7% were authentic and accurate. The likelihood of fabricated references significantly differed based on prompt variations; yet the frequency of authentic and accurate references remained low in all cases. Among the seven components evaluated for each reference, an incorrect PMID number was most common, listed in 93% of papers. Incorrect volume (64%), page numbers (64%), and year of publication (60%) were the next most frequent errors. The mean number of inaccurate components was 4.3±2.8 out of seven per reference. Conclusions The findings of this study emphasize the need for caution when seeking medical information on ChatGPT since most of the references provided were found to be fabricated or inaccurate. 
Individuals are advised to verify medical information from reliable sources and avoid relying solely on artificial intelligence-generated content. Cureus 2023-05-19 /pmc/articles/PMC10277170/ /pubmed/37337480 http://dx.doi.org/10.7759/cureus.39238 Text en Copyright © 2023, Bhattacharyya et al. https://creativecommons.org/licenses/by/3.0/This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. |
spellingShingle | Medical Education Bhattacharyya, Mehul Miller, Valerie M Bhattacharyya, Debjani Miller, Larry E High Rates of Fabricated and Inaccurate References in ChatGPT-Generated Medical Content |
title | High Rates of Fabricated and Inaccurate References in ChatGPT-Generated Medical Content |
title_full | High Rates of Fabricated and Inaccurate References in ChatGPT-Generated Medical Content |
title_fullStr | High Rates of Fabricated and Inaccurate References in ChatGPT-Generated Medical Content |
title_full_unstemmed | High Rates of Fabricated and Inaccurate References in ChatGPT-Generated Medical Content |
title_short | High Rates of Fabricated and Inaccurate References in ChatGPT-Generated Medical Content |
title_sort | high rates of fabricated and inaccurate references in chatgpt-generated medical content |
topic | Medical Education |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10277170/ https://www.ncbi.nlm.nih.gov/pubmed/37337480 http://dx.doi.org/10.7759/cureus.39238 |
work_keys_str_mv | AT bhattacharyyamehul highratesoffabricatedandinaccuratereferencesinchatgptgeneratedmedicalcontent AT millervaleriem highratesoffabricatedandinaccuratereferencesinchatgptgeneratedmedicalcontent AT bhattacharyyadebjani highratesoffabricatedandinaccuratereferencesinchatgptgeneratedmedicalcontent AT millerlarrye highratesoffabricatedandinaccuratereferencesinchatgptgeneratedmedicalcontent |