Clinical Research With Large Language Models Generated Writing—Clinical Research with AI-assisted Writing (CRAW) Study

IMPORTANCE: The scientific community debates Generative Pre-trained Transformer (GPT)-3.5’s article quality, authorship merit, originality, and ethical use in scientific writing. OBJECTIVES: Assess GPT-3.5’s ability to craft the background section of critical care clinical research questions compare...

Bibliographic Details
Main Authors:	Huespe, Ivan A., Echeverri, Jorge, Khalid, Aisha, Carboni Bisso, Indalecio, Musso, Carlos G., Surani, Salim, Bansal, Vikas, Kashyap, Rahul
Format:	Online Article Text
Language:	English
Published:	Lippincott Williams & Wilkins 2023
Subjects:	Observational Study
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10547240/ https://www.ncbi.nlm.nih.gov/pubmed/37795455 http://dx.doi.org/10.1097/CCE.0000000000000975

_version_	1785115018621091840
author	Huespe, Ivan A. Echeverri, Jorge Khalid, Aisha Carboni Bisso, Indalecio Musso, Carlos G. Surani, Salim Bansal, Vikas Kashyap, Rahul
author_facet	Huespe, Ivan A. Echeverri, Jorge Khalid, Aisha Carboni Bisso, Indalecio Musso, Carlos G. Surani, Salim Bansal, Vikas Kashyap, Rahul
author_sort	Huespe, Ivan A.
collection	PubMed
description	IMPORTANCE: The scientific community debates Generative Pre-trained Transformer (GPT)-3.5’s article quality, authorship merit, originality, and ethical use in scientific writing. OBJECTIVES: Assess GPT-3.5’s ability to craft the background section of critical care clinical research questions compared to medical researchers with H-indices of 22 and 13. DESIGN: Observational cross-sectional study. SETTING: Researchers from 20 countries from six continents evaluated the backgrounds. PARTICIPANTS: Researchers with a Scopus index greater than 1 were included. MAIN OUTCOMES AND MEASURES: In this study, we generated a background section of a critical care clinical research question on “acute kidney injury in sepsis” using three different methods: researcher with H-index greater than 20, researcher with H-index greater than 10, and GPT-3.5. The three background sections were presented in a blinded survey to researchers with an H-index range between 1 and 96. First, the researchers evaluated the main components of the background using a 5-point Likert scale. Second, they were asked to identify which background was written by humans only or with large language model-generated tools. RESULTS: A total of 80 researchers completed the survey. The median H-index was 3 (interquartile range, 1–7.25) and most (36%) researchers were from the Critical Care specialty. When compared with researchers with an H-index of 22 and 13, GPT-3.5 was marked high on the Likert scale ranking on main background components (median 4.5 vs. 3.82 vs. 3.6 vs. 4.5, respectively; p < 0.001). The sensitivity and specificity to detect researchers writing versus GPT-3.5 writing were poor, 22.4% and 57.6%, respectively. CONCLUSIONS AND RELEVANCE: GPT-3.5 could create background research content indistinguishable from the writing of a medical researcher. It was marked higher compared with medical researchers with an H-index of 22 and 13 in writing the background section of a critical care clinical research question.
format	Online Article Text
id	pubmed-10547240
institution	National Center for Biotechnology Information
language	English
publishDate	2023
publisher	Lippincott Williams & Wilkins
record_format	MEDLINE/PubMed
spelling	pubmed-10547240 2023-10-04 Clinical Research With Large Language Models Generated Writing—Clinical Research with AI-assisted Writing (CRAW) Study Huespe, Ivan A. Echeverri, Jorge Khalid, Aisha Carboni Bisso, Indalecio Musso, Carlos G. Surani, Salim Bansal, Vikas Kashyap, Rahul Crit Care Explor Observational Study IMPORTANCE: The scientific community debates Generative Pre-trained Transformer (GPT)-3.5’s article quality, authorship merit, originality, and ethical use in scientific writing. OBJECTIVES: Assess GPT-3.5’s ability to craft the background section of critical care clinical research questions compared to medical researchers with H-indices of 22 and 13. DESIGN: Observational cross-sectional study. SETTING: Researchers from 20 countries from six continents evaluated the backgrounds. PARTICIPANTS: Researchers with a Scopus index greater than 1 were included. MAIN OUTCOMES AND MEASURES: In this study, we generated a background section of a critical care clinical research question on “acute kidney injury in sepsis” using three different methods: researcher with H-index greater than 20, researcher with H-index greater than 10, and GPT-3.5. The three background sections were presented in a blinded survey to researchers with an H-index range between 1 and 96. First, the researchers evaluated the main components of the background using a 5-point Likert scale. Second, they were asked to identify which background was written by humans only or with large language model-generated tools. RESULTS: A total of 80 researchers completed the survey. The median H-index was 3 (interquartile range, 1–7.25) and most (36%) researchers were from the Critical Care specialty. When compared with researchers with an H-index of 22 and 13, GPT-3.5 was marked high on the Likert scale ranking on main background components (median 4.5 vs. 3.82 vs. 3.6 vs. 4.5, respectively; p < 0.001). The sensitivity and specificity to detect researchers writing versus GPT-3.5 writing were poor, 22.4% and 57.6%, respectively. CONCLUSIONS AND RELEVANCE: GPT-3.5 could create background research content indistinguishable from the writing of a medical researcher. It was marked higher compared with medical researchers with an H-index of 22 and 13 in writing the background section of a critical care clinical research question. Lippincott Williams & Wilkins 2023-10-02 /pmc/articles/PMC10547240/ /pubmed/37795455 http://dx.doi.org/10.1097/CCE.0000000000000975 Text en Copyright © 2023 The Authors. Published by Wolters Kluwer Health, Inc. on behalf of the Society of Critical Care Medicine. https://creativecommons.org/licenses/by-nc-nd/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution-Non Commercial-No Derivatives License 4.0 (CCBY-NC-ND) (https://creativecommons.org/licenses/by-nc-nd/4.0/), where it is permissible to download and share the work provided it is properly cited. The work cannot be changed in any way or used commercially without permission from the journal.
spellingShingle	Observational Study Huespe, Ivan A. Echeverri, Jorge Khalid, Aisha Carboni Bisso, Indalecio Musso, Carlos G. Surani, Salim Bansal, Vikas Kashyap, Rahul Clinical Research With Large Language Models Generated Writing—Clinical Research with AI-assisted Writing (CRAW) Study
title	Clinical Research With Large Language Models Generated Writing—Clinical Research with AI-assisted Writing (CRAW) Study
title_full	Clinical Research With Large Language Models Generated Writing—Clinical Research with AI-assisted Writing (CRAW) Study
title_fullStr	Clinical Research With Large Language Models Generated Writing—Clinical Research with AI-assisted Writing (CRAW) Study
title_full_unstemmed	Clinical Research With Large Language Models Generated Writing—Clinical Research with AI-assisted Writing (CRAW) Study
title_short	Clinical Research With Large Language Models Generated Writing—Clinical Research with AI-assisted Writing (CRAW) Study
title_sort	clinical research with large language models generated writing—clinical research with ai-assisted writing (craw) study
topic	Observational Study
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10547240/ https://www.ncbi.nlm.nih.gov/pubmed/37795455 http://dx.doi.org/10.1097/CCE.0000000000000975
work_keys_str_mv	AT huespeivana clinicalresearchwithlargelanguagemodelsgeneratedwritingclinicalresearchwithaiassistedwritingcrawstudy AT echeverrijorge clinicalresearchwithlargelanguagemodelsgeneratedwritingclinicalresearchwithaiassistedwritingcrawstudy AT khalidaisha clinicalresearchwithlargelanguagemodelsgeneratedwritingclinicalresearchwithaiassistedwritingcrawstudy AT carbonibissoindalecio clinicalresearchwithlargelanguagemodelsgeneratedwritingclinicalresearchwithaiassistedwritingcrawstudy AT mussocarlosg clinicalresearchwithlargelanguagemodelsgeneratedwritingclinicalresearchwithaiassistedwritingcrawstudy AT suranisalim clinicalresearchwithlargelanguagemodelsgeneratedwritingclinicalresearchwithaiassistedwritingcrawstudy AT bansalvikas clinicalresearchwithlargelanguagemodelsgeneratedwritingclinicalresearchwithaiassistedwritingcrawstudy AT kashyaprahul clinicalresearchwithlargelanguagemodelsgeneratedwritingclinicalresearchwithaiassistedwritingcrawstudy
