Evaluation of Artificial Intelligence–generated Responses to Common Plastic Surgery Questions
Main Authors: | Copeland-Halperin, Libby R.; O’Brien, Lauren; Copeland, Michelle |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Lippincott Williams & Wilkins, 2023 |
Subjects: | Technology |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10468106/ https://www.ncbi.nlm.nih.gov/pubmed/37654681 http://dx.doi.org/10.1097/GOX.0000000000005226 |
_version_ | 1785099173591252992 |
---|---|
author | Copeland-Halperin, Libby R.; O’Brien, Lauren; Copeland, Michelle |
author_facet | Copeland-Halperin, Libby R.; O’Brien, Lauren; Copeland, Michelle |
author_sort | Copeland-Halperin, Libby R. |
collection | PubMed |
description | BACKGROUND: Artificial intelligence (AI) is increasingly used to answer questions, yet the accuracy and validity of current tools are uncertain. In contrast to internet queries, AI presents summary responses as definitive. The internet is rife with inaccuracies, and plastic surgery management guidelines evolve, making verifiable information important. METHODS: We posed 10 questions about breast implant-associated illness, anaplastic large cell lymphoma, and squamous cell carcinoma to Bing, using the “more balanced” option, and to ChatGPT. Answers were reviewed by two plastic surgeons for accuracy and fidelity to information on the Food and Drug Administration (FDA) and American Society of Plastic Surgeons (ASPS) websites. We also presented 10 multiple-choice questions from the 2022 plastic surgery in-service examination to Bing, using the “more precise” option, and to ChatGPT. Questions were repeated three times over consecutive weeks, and answers were evaluated for accuracy and stability. RESULTS: Compared with answers from the FDA and ASPS, Bing and ChatGPT were accurate. Of the 30 multiple-choice trials (10 questions, each posed three times), Bing answered 10 correctly, nine incorrectly, and did not answer 11. ChatGPT correctly answered 16 and incorrectly answered 14. In both parts, responses from Bing were shorter, less detailed, and referred to verified and unverified sources; ChatGPT did not provide citations. CONCLUSIONS: These AI tools provided accurate information from the FDA and ASPS websites, but neither consistently answered questions requiring nuanced decision-making correctly. Advances in applications to plastic surgery will require algorithms that selectively identify, evaluate, and exclude information to enhance the accuracy, precision, validity, reliability, and utility of AI-generated responses. (See the worked sketch after this record.) |
format | Online Article Text |
id | pubmed-10468106 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Lippincott Williams & Wilkins |
record_format | MEDLINE/PubMed |
spelling | pubmed-10468106 2023-08-31 Evaluation of Artificial Intelligence–generated Responses to Common Plastic Surgery Questions Copeland-Halperin, Libby R.; O’Brien, Lauren; Copeland, Michelle Plast Reconstr Surg Glob Open Technology BACKGROUND: Artificial intelligence (AI) is increasingly used to answer questions, yet the accuracy and validity of current tools are uncertain. In contrast to internet queries, AI presents summary responses as definitive. The internet is rife with inaccuracies, and plastic surgery management guidelines evolve, making verifiable information important. METHODS: We posed 10 questions about breast implant-associated illness, anaplastic large cell lymphoma, and squamous cell carcinoma to Bing, using the “more balanced” option, and to ChatGPT. Answers were reviewed by two plastic surgeons for accuracy and fidelity to information on the Food and Drug Administration (FDA) and American Society of Plastic Surgeons (ASPS) websites. We also presented 10 multiple-choice questions from the 2022 plastic surgery in-service examination to Bing, using the “more precise” option, and to ChatGPT. Questions were repeated three times over consecutive weeks, and answers were evaluated for accuracy and stability. RESULTS: Compared with answers from the FDA and ASPS, Bing and ChatGPT were accurate. Of the 30 multiple-choice trials (10 questions, each posed three times), Bing answered 10 correctly, nine incorrectly, and did not answer 11. ChatGPT correctly answered 16 and incorrectly answered 14. In both parts, responses from Bing were shorter, less detailed, and referred to verified and unverified sources; ChatGPT did not provide citations. CONCLUSIONS: These AI tools provided accurate information from the FDA and ASPS websites, but neither consistently answered questions requiring nuanced decision-making correctly. Advances in applications to plastic surgery will require algorithms that selectively identify, evaluate, and exclude information to enhance the accuracy, precision, validity, reliability, and utility of AI-generated responses. Lippincott Williams & Wilkins 2023-08-30 /pmc/articles/PMC10468106/ /pubmed/37654681 http://dx.doi.org/10.1097/GOX.0000000000005226 Text en Copyright © 2023 The Authors. Published by Wolters Kluwer Health, Inc. on behalf of The American Society of Plastic Surgeons. https://creativecommons.org/licenses/by-nc-nd/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution-NonCommercial-NoDerivatives License 4.0 (CC BY-NC-ND) (https://creativecommons.org/licenses/by-nc-nd/4.0/), where it is permissible to download and share the work provided it is properly cited. The work cannot be changed in any way or used commercially without permission from the journal. |
spellingShingle | Technology Copeland-Halperin, Libby R. O’Brien, Lauren Copeland, Michelle Evaluation of Artificial Intelligence–generated Responses to Common Plastic Surgery Questions |
title | Evaluation of Artificial Intelligence–generated Responses to Common Plastic Surgery Questions |
title_full | Evaluation of Artificial Intelligence–generated Responses to Common Plastic Surgery Questions |
title_fullStr | Evaluation of Artificial Intelligence–generated Responses to Common Plastic Surgery Questions |
title_full_unstemmed | Evaluation of Artificial Intelligence–generated Responses to Common Plastic Surgery Questions |
title_short | Evaluation of Artificial Intelligence–generated Responses to Common Plastic Surgery Questions |
title_sort | evaluation of artificial intelligence–generated responses to common plastic surgery questions |
topic | Technology |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10468106/ https://www.ncbi.nlm.nih.gov/pubmed/37654681 http://dx.doi.org/10.1097/GOX.0000000000005226 |
work_keys_str_mv | AT copelandhalperinlibbyr evaluationofartificialintelligencegeneratedresponsestocommonplasticsurgeryquestions AT obrienlauren evaluationofartificialintelligencegeneratedresponsestocommonplasticsurgeryquestions AT copelandmichelle evaluationofartificialintelligencegeneratedresponsestocommonplasticsurgeryquestions |
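The RESULTS tallies in the description above imply simple proportions that are easy to miss in prose: 10 in-service questions asked three times yield 30 graded trials, so Bing's 10 correct answers correspond to roughly 33% accuracy and ChatGPT's 16 to roughly 53%. Below is a minimal sketch of that arithmetic, assuming only the counts reported in the abstract; the variable names and structure are illustrative, not the authors' code.

```python
# Hypothetical reconstruction of the multiple-choice scoring described in the
# abstract (not the authors' code; all names here are illustrative).
# 10 in-service questions were each posed three times, giving 30 graded trials.

from collections import Counter

TOTAL_TRIALS = 10 * 3  # 10 questions x 3 repetitions over consecutive weeks

# Tallies as reported in the RESULTS section.
bing = Counter(correct=10, incorrect=9, unanswered=11)
chatgpt = Counter(correct=16, incorrect=14, unanswered=0)

for name, tally in (("Bing", bing), ("ChatGPT", chatgpt)):
    assert sum(tally.values()) == TOTAL_TRIALS  # counts must cover all 30 trials
    rate = tally["correct"] / TOTAL_TRIALS
    print(f"{name}: {rate:.0%} correct ({tally['correct']}/{TOTAL_TRIALS}), "
          f"{tally['unanswered']} unanswered")

# Output:
# Bing: 33% correct (10/30), 11 unanswered
# ChatGPT: 53% correct (16/30), 0 unanswered
```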