Evaluation of Artificial Intelligence–generated Responses to Common Plastic Surgery Questions


Bibliographic Details
Main Authors: Copeland-Halperin, Libby R., O’Brien, Lauren, Copeland, Michelle
Format: Online Article Text
Language: English
Published: Lippincott Williams & Wilkins, 2023
Subjects: Technology
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10468106/
https://www.ncbi.nlm.nih.gov/pubmed/37654681
http://dx.doi.org/10.1097/GOX.0000000000005226
Collection: PubMed
Abstract:

BACKGROUND: Artificial intelligence (AI) is increasingly used to answer questions, yet the accuracy and validity of current tools are uncertain. In contrast to internet queries, AI presents summary responses as definitive. The internet is rife with inaccuracies, and plastic surgery management guidelines evolve, making verifiable information important.

METHODS: We posed 10 questions about breast implant-associated illness, anaplastic large cell lymphoma, and squamous carcinoma to Bing, using the “more balanced” option, and to ChatGPT. Answers were reviewed by two plastic surgeons for accuracy and fidelity to information on the Food and Drug Administration (FDA) and American Society of Plastic Surgeons (ASPS) websites. We also presented 10 multiple-choice questions from the 2022 plastic surgery in-service examination to Bing, using the “more precise” option, and to ChatGPT. Questions were repeated three times over consecutive weeks, and answers were evaluated for accuracy and stability.

RESULTS: Compared with answers from the FDA and ASPS, Bing and ChatGPT were accurate. Bing answered 10 of the 30 multiple-choice questions correctly, nine incorrectly, and did not answer 11. ChatGPT correctly answered 16 and incorrectly answered 14. In both parts, responses from Bing were shorter, less detailed, and referred to verified and unverified sources; ChatGPT did not provide citations.

CONCLUSIONS: These AI tools provided accurate information from the FDA and ASPS websites, but neither consistently answered questions requiring nuanced decision-making correctly. Advances in applications to plastic surgery will require algorithms that selectively identify, evaluate, and exclude information to enhance the accuracy, precision, validity, reliability, and utility of AI-generated responses.
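The scoring described in the RESULTS section reduces to simple tallies over the 30 multiple-choice trials (10 in-service questions, each posed three times). As a minimal sketch using only the counts reported in the abstract (the dictionary layout is an illustrative assumption, not the authors' scoring instrument):

```python
# Multiple-choice outcome counts as reported in the abstract.
results = {
    "Bing":    {"correct": 10, "incorrect": 9, "unanswered": 11},
    "ChatGPT": {"correct": 16, "incorrect": 14, "unanswered": 0},
}

for tool, r in results.items():
    total = sum(r.values())          # 30 trials per tool
    accuracy = r["correct"] / total  # fraction answered correctly
    print(f"{tool}: {r['correct']}/{total} correct ({accuracy:.0%})")
```

This makes the headline comparison explicit: Bing answered 33% of trials correctly, ChatGPT 53%, with Bing additionally declining to answer over a third of the questions.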
Record ID: pubmed-10468106
Institution: National Center for Biotechnology Information
Record Format: MEDLINE/PubMed
Journal: Plast Reconstr Surg Glob Open
Published online: 2023-08-30

Copyright © 2023 The Authors. Published by Wolters Kluwer Health, Inc. on behalf of The American Society of Plastic Surgeons. This is an open-access article distributed under the terms of the Creative Commons Attribution-Non Commercial-No Derivatives License 4.0 (CC BY-NC-ND 4.0) (https://creativecommons.org/licenses/by-nc-nd/4.0/), where it is permissible to download and share the work provided it is properly cited. The work cannot be changed in any way or used commercially without permission from the journal.