
A pilot study on the efficacy of GPT-4 in providing orthopedic treatment recommendations from MRI reports

Large language models (LLMs) have shown potential in various applications, including clinical practice. However, their accuracy and utility in providing treatment recommendations for orthopedic conditions remain to be investigated. Thus, this pilot study aims to evaluate the validity of treatment re...


Bibliographic Details
Main Authors:	Truhn, Daniel; Weber, Christian D.; Braun, Benedikt J.; Bressem, Keno; Kather, Jakob N.; Kuhl, Christiane; Nebelung, Sven
Format:	Online Article Text
Language:	English
Published:	Nature Publishing Group UK 2023
Subjects:	Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10656559/ https://www.ncbi.nlm.nih.gov/pubmed/37978240 http://dx.doi.org/10.1038/s41598-023-47500-2

author	Truhn, Daniel; Weber, Christian D.; Braun, Benedikt J.; Bressem, Keno; Kather, Jakob N.; Kuhl, Christiane; Nebelung, Sven
author_facet	Truhn, Daniel; Weber, Christian D.; Braun, Benedikt J.; Bressem, Keno; Kather, Jakob N.; Kuhl, Christiane; Nebelung, Sven
author_sort	Truhn, Daniel
collection	PubMed
description	Large language models (LLMs) have shown potential in various applications, including clinical practice. However, their accuracy and utility in providing treatment recommendations for orthopedic conditions remain to be investigated. Thus, this pilot study aims to evaluate the validity of treatment recommendations generated by GPT-4 for common knee and shoulder orthopedic conditions using anonymized clinical MRI reports. A retrospective analysis was conducted using 20 anonymized clinical MRI reports, with varying severity and complexity. Treatment recommendations were elicited from GPT-4 and evaluated by two board-certified specialty-trained senior orthopedic surgeons. Their evaluation focused on semiquantitative gradings of accuracy and clinical utility and potential limitations of the LLM-generated recommendations. GPT-4 provided treatment recommendations for 20 patients (mean age, 50 years ± 19 [standard deviation]; 12 men) with acute and chronic knee and shoulder conditions. The LLM produced largely accurate and clinically useful recommendations. However, limited awareness of a patient’s overall situation, a tendency to incorrectly appreciate treatment urgency, and largely schematic and unspecific treatment recommendations were observed and may reduce its clinical usefulness. In conclusion, LLM-based treatment recommendations are largely adequate and not prone to ‘hallucinations’, yet inadequate in particular situations. Critical guidance by healthcare professionals is obligatory, and independent use by patients is discouraged, given the dependency on precise data input.
format	Online Article Text
id	pubmed-10656559
institution	National Center for Biotechnology Information
language	English
publishDate	2023
publisher	Nature Publishing Group UK
record_format	MEDLINE/PubMed
spelling	pubmed-10656559 2023-11-17 Sci Rep Article
Nature Publishing Group UK 2023-11-17 /pmc/articles/PMC10656559/ /pubmed/37978240 http://dx.doi.org/10.1038/s41598-023-47500-2 Text en © The Author(s) 2023. Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit https://creativecommons.org/licenses/by/4.0/.
title	A pilot study on the efficacy of GPT-4 in providing orthopedic treatment recommendations from MRI reports
topic	Article
