Clinical Research With Large Language Models Generated Writing—Clinical Research with AI-assisted Writing (CRAW) Study

IMPORTANCE: The scientific community debates Generative Pre-trained Transformer (GPT)-3.5’s article quality, authorship merit, originality, and ethical use in scientific writing. OBJECTIVES: Assess GPT-3.5’s ability to craft the background section of critical care clinical research questions compare...

Bibliographic Details
Main authors: Huespe, Ivan A., Echeverri, Jorge, Khalid, Aisha, Carboni Bisso, Indalecio, Musso, Carlos G., Surani, Salim, Bansal, Vikas, Kashyap, Rahul
Format: Online Article Text
Language: English
Published: Lippincott Williams & Wilkins 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10547240/
https://www.ncbi.nlm.nih.gov/pubmed/37795455
http://dx.doi.org/10.1097/CCE.0000000000000975
_version_ 1785115018621091840
author Huespe, Ivan A.
Echeverri, Jorge
Khalid, Aisha
Carboni Bisso, Indalecio
Musso, Carlos G.
Surani, Salim
Bansal, Vikas
Kashyap, Rahul
author_facet Huespe, Ivan A.
Echeverri, Jorge
Khalid, Aisha
Carboni Bisso, Indalecio
Musso, Carlos G.
Surani, Salim
Bansal, Vikas
Kashyap, Rahul
author_sort Huespe, Ivan A.
collection PubMed
description IMPORTANCE: The scientific community debates Generative Pre-trained Transformer (GPT)-3.5’s article quality, authorship merit, originality, and ethical use in scientific writing. OBJECTIVES: Assess GPT-3.5’s ability to craft the background section of critical care clinical research questions compared to medical researchers with H-indices of 22 and 13. DESIGN: Observational cross-sectional study. SETTING: Researchers from 20 countries from six continents evaluated the backgrounds. PARTICIPANTS: Researchers with a Scopus index greater than 1 were included. MAIN OUTCOMES AND MEASURES: In this study, we generated a background section of a critical care clinical research question on “acute kidney injury in sepsis” using three different methods: researcher with H-index greater than 20, researcher with H-index greater than 10, and GPT-3.5. The three background sections were presented in a blinded survey to researchers with an H-index range between 1 and 96. First, the researchers evaluated the main components of the background using a 5-point Likert scale. Second, they were asked to identify which background was written by humans only or with large language model-generated tools. RESULTS: A total of 80 researchers completed the survey. The median H-index was 3 (interquartile range, 1–7.25) and most (36%) researchers were from the Critical Care specialty. When compared with researchers with an H-index of 22 and 13, GPT-3.5 was marked high on the Likert scale ranking on main background components (median 4.5 vs. 3.82 vs. 3.6 vs. 4.5, respectively; p < 0.001). The sensitivity and specificity to detect researchers writing versus GPT-3.5 writing were poor, 22.4% and 57.6%, respectively. CONCLUSIONS AND RELEVANCE: GPT-3.5 could create background research content indistinguishable from the writing of a medical researcher. 
It was marked higher compared with medical researchers with an H-index of 22 and 13 in writing the background section of a critical care clinical research question.
format Online
Article
Text
id pubmed-10547240
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Lippincott Williams & Wilkins
record_format MEDLINE/PubMed
spelling pubmed-105472402023-10-04 Clinical Research With Large Language Models Generated Writing—Clinical Research with AI-assisted Writing (CRAW) Study Huespe, Ivan A. Echeverri, Jorge Khalid, Aisha Carboni Bisso, Indalecio Musso, Carlos G. Surani, Salim Bansal, Vikas Kashyap, Rahul Crit Care Explor Observational Study IMPORTANCE: The scientific community debates Generative Pre-trained Transformer (GPT)-3.5’s article quality, authorship merit, originality, and ethical use in scientific writing. OBJECTIVES: Assess GPT-3.5’s ability to craft the background section of critical care clinical research questions compared to medical researchers with H-indices of 22 and 13. DESIGN: Observational cross-sectional study. SETTING: Researchers from 20 countries from six continents evaluated the backgrounds. PARTICIPANTS: Researchers with a Scopus index greater than 1 were included. MAIN OUTCOMES AND MEASURES: In this study, we generated a background section of a critical care clinical research question on “acute kidney injury in sepsis” using three different methods: researcher with H-index greater than 20, researcher with H-index greater than 10, and GPT-3.5. The three background sections were presented in a blinded survey to researchers with an H-index range between 1 and 96. First, the researchers evaluated the main components of the background using a 5-point Likert scale. Second, they were asked to identify which background was written by humans only or with large language model-generated tools. RESULTS: A total of 80 researchers completed the survey. The median H-index was 3 (interquartile range, 1–7.25) and most (36%) researchers were from the Critical Care specialty. When compared with researchers with an H-index of 22 and 13, GPT-3.5 was marked high on the Likert scale ranking on main background components (median 4.5 vs. 3.82 vs. 3.6 vs. 4.5, respectively; p < 0.001). 
The sensitivity and specificity to detect researchers writing versus GPT-3.5 writing were poor, 22.4% and 57.6%, respectively. CONCLUSIONS AND RELEVANCE: GPT-3.5 could create background research content indistinguishable from the writing of a medical researcher. It was marked higher compared with medical researchers with an H-index of 22 and 13 in writing the background section of a critical care clinical research question. Lippincott Williams & Wilkins 2023-10-02 /pmc/articles/PMC10547240/ /pubmed/37795455 http://dx.doi.org/10.1097/CCE.0000000000000975 Text en Copyright © 2023 The Authors. Published by Wolters Kluwer Health, Inc. on behalf of the Society of Critical Care Medicine. https://creativecommons.org/licenses/by-nc-nd/4.0/This is an open-access article distributed under the terms of the Creative Commons Attribution-Non Commercial-No Derivatives License 4.0 (CCBY-NC-ND) (https://creativecommons.org/licenses/by-nc-nd/4.0/) , where it is permissible to download and share the work provided it is properly cited. The work cannot be changed in any way or used commercially without permission from the journal.
spellingShingle Observational Study
Huespe, Ivan A.
Echeverri, Jorge
Khalid, Aisha
Carboni Bisso, Indalecio
Musso, Carlos G.
Surani, Salim
Bansal, Vikas
Kashyap, Rahul
Clinical Research With Large Language Models Generated Writing—Clinical Research with AI-assisted Writing (CRAW) Study
title Clinical Research With Large Language Models Generated Writing—Clinical Research with AI-assisted Writing (CRAW) Study
title_full Clinical Research With Large Language Models Generated Writing—Clinical Research with AI-assisted Writing (CRAW) Study
title_fullStr Clinical Research With Large Language Models Generated Writing—Clinical Research with AI-assisted Writing (CRAW) Study
title_full_unstemmed Clinical Research With Large Language Models Generated Writing—Clinical Research with AI-assisted Writing (CRAW) Study
title_short Clinical Research With Large Language Models Generated Writing—Clinical Research with AI-assisted Writing (CRAW) Study
title_sort clinical research with large language models generated writing—clinical research with ai-assisted writing (craw) study
topic Observational Study
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10547240/
https://www.ncbi.nlm.nih.gov/pubmed/37795455
http://dx.doi.org/10.1097/CCE.0000000000000975
work_keys_str_mv AT huespeivana clinicalresearchwithlargelanguagemodelsgeneratedwritingclinicalresearchwithaiassistedwritingcrawstudy
AT echeverrijorge clinicalresearchwithlargelanguagemodelsgeneratedwritingclinicalresearchwithaiassistedwritingcrawstudy
AT khalidaisha clinicalresearchwithlargelanguagemodelsgeneratedwritingclinicalresearchwithaiassistedwritingcrawstudy
AT carbonibissoindalecio clinicalresearchwithlargelanguagemodelsgeneratedwritingclinicalresearchwithaiassistedwritingcrawstudy
AT mussocarlosg clinicalresearchwithlargelanguagemodelsgeneratedwritingclinicalresearchwithaiassistedwritingcrawstudy
AT suranisalim clinicalresearchwithlargelanguagemodelsgeneratedwritingclinicalresearchwithaiassistedwritingcrawstudy
AT bansalvikas clinicalresearchwithlargelanguagemodelsgeneratedwritingclinicalresearchwithaiassistedwritingcrawstudy
AT kashyaprahul clinicalresearchwithlargelanguagemodelsgeneratedwritingclinicalresearchwithaiassistedwritingcrawstudy