Clinical Research With Large Language Models Generated Writing—Clinical Research with AI-assisted Writing (CRAW) Study
IMPORTANCE: The scientific community debates Generative Pre-trained Transformer (GPT)-3.5’s article quality, authorship merit, originality, and ethical use in scientific writing. OBJECTIVES: Assess GPT-3.5’s ability to craft the background section of critical care clinical research questions compare...
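Main Authors: | Huespe, Ivan A.; Echeverri, Jorge; Khalid, Aisha; Carboni Bisso, Indalecio; Musso, Carlos G.; Surani, Salim; Bansal, Vikas; Kashyap, Rahul |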
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Lippincott Williams & Wilkins 2023 |
Subjects: | Observational Study |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10547240/ https://www.ncbi.nlm.nih.gov/pubmed/37795455 http://dx.doi.org/10.1097/CCE.0000000000000975 |
_version_ | 1785115018621091840 |
---|---|
author | Huespe, Ivan A. Echeverri, Jorge Khalid, Aisha Carboni Bisso, Indalecio Musso, Carlos G. Surani, Salim Bansal, Vikas Kashyap, Rahul |
author_facet | Huespe, Ivan A. Echeverri, Jorge Khalid, Aisha Carboni Bisso, Indalecio Musso, Carlos G. Surani, Salim Bansal, Vikas Kashyap, Rahul |
author_sort | Huespe, Ivan A. |
collection | PubMed |
description | IMPORTANCE: The scientific community debates Generative Pre-trained Transformer (GPT)-3.5’s article quality, authorship merit, originality, and ethical use in scientific writing. OBJECTIVES: Assess GPT-3.5’s ability to craft the background section of critical care clinical research questions compared to medical researchers with H-indices of 22 and 13. DESIGN: Observational cross-sectional study. SETTING: Researchers from 20 countries from six continents evaluated the backgrounds. PARTICIPANTS: Researchers with a Scopus index greater than 1 were included. MAIN OUTCOMES AND MEASURES: In this study, we generated a background section of a critical care clinical research question on “acute kidney injury in sepsis” using three different methods: researcher with H-index greater than 20, researcher with H-index greater than 10, and GPT-3.5. The three background sections were presented in a blinded survey to researchers with an H-index range between 1 and 96. First, the researchers evaluated the main components of the background using a 5-point Likert scale. Second, they were asked to identify which background was written by humans only or with large language model-generated tools. RESULTS: A total of 80 researchers completed the survey. The median H-index was 3 (interquartile range, 1–7.25) and most (36%) researchers were from the Critical Care specialty. When compared with researchers with an H-index of 22 and 13, GPT-3.5 was marked high on the Likert scale ranking on main background components (median 4.5 vs. 3.82 vs. 3.6 vs. 4.5, respectively; p < 0.001). The sensitivity and specificity to detect researchers writing versus GPT-3.5 writing were poor, 22.4% and 57.6%, respectively. CONCLUSIONS AND RELEVANCE: GPT-3.5 could create background research content indistinguishable from the writing of a medical researcher. It was marked higher compared with medical researchers with an H-index of 22 and 13 in writing the background section of a critical care clinical research question. |
format | Online Article Text |
id | pubmed-10547240 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Lippincott Williams & Wilkins |
record_format | MEDLINE/PubMed |
spelling | pubmed-10547240 2023-10-04 Clinical Research With Large Language Models Generated Writing—Clinical Research with AI-assisted Writing (CRAW) Study Huespe, Ivan A. Echeverri, Jorge Khalid, Aisha Carboni Bisso, Indalecio Musso, Carlos G. Surani, Salim Bansal, Vikas Kashyap, Rahul Crit Care Explor Observational Study IMPORTANCE: The scientific community debates Generative Pre-trained Transformer (GPT)-3.5’s article quality, authorship merit, originality, and ethical use in scientific writing. OBJECTIVES: Assess GPT-3.5’s ability to craft the background section of critical care clinical research questions compared to medical researchers with H-indices of 22 and 13. DESIGN: Observational cross-sectional study. SETTING: Researchers from 20 countries from six continents evaluated the backgrounds. PARTICIPANTS: Researchers with a Scopus index greater than 1 were included. MAIN OUTCOMES AND MEASURES: In this study, we generated a background section of a critical care clinical research question on “acute kidney injury in sepsis” using three different methods: researcher with H-index greater than 20, researcher with H-index greater than 10, and GPT-3.5. The three background sections were presented in a blinded survey to researchers with an H-index range between 1 and 96. First, the researchers evaluated the main components of the background using a 5-point Likert scale. Second, they were asked to identify which background was written by humans only or with large language model-generated tools. RESULTS: A total of 80 researchers completed the survey. The median H-index was 3 (interquartile range, 1–7.25) and most (36%) researchers were from the Critical Care specialty. When compared with researchers with an H-index of 22 and 13, GPT-3.5 was marked high on the Likert scale ranking on main background components (median 4.5 vs. 3.82 vs. 3.6 vs. 4.5, respectively; p < 0.001). The sensitivity and specificity to detect researchers writing versus GPT-3.5 writing were poor, 22.4% and 57.6%, respectively. CONCLUSIONS AND RELEVANCE: GPT-3.5 could create background research content indistinguishable from the writing of a medical researcher. It was marked higher compared with medical researchers with an H-index of 22 and 13 in writing the background section of a critical care clinical research question. Lippincott Williams & Wilkins 2023-10-02 /pmc/articles/PMC10547240/ /pubmed/37795455 http://dx.doi.org/10.1097/CCE.0000000000000975 Text en Copyright © 2023 The Authors. Published by Wolters Kluwer Health, Inc. on behalf of the Society of Critical Care Medicine. https://creativecommons.org/licenses/by-nc-nd/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution-Non Commercial-No Derivatives License 4.0 (CCBY-NC-ND) (https://creativecommons.org/licenses/by-nc-nd/4.0/), where it is permissible to download and share the work provided it is properly cited. The work cannot be changed in any way or used commercially without permission from the journal. |
spellingShingle | Observational Study Huespe, Ivan A. Echeverri, Jorge Khalid, Aisha Carboni Bisso, Indalecio Musso, Carlos G. Surani, Salim Bansal, Vikas Kashyap, Rahul Clinical Research With Large Language Models Generated Writing—Clinical Research with AI-assisted Writing (CRAW) Study |
title | Clinical Research With Large Language Models Generated Writing—Clinical Research with AI-assisted Writing (CRAW) Study |
title_full | Clinical Research With Large Language Models Generated Writing—Clinical Research with AI-assisted Writing (CRAW) Study |
title_fullStr | Clinical Research With Large Language Models Generated Writing—Clinical Research with AI-assisted Writing (CRAW) Study |
title_full_unstemmed | Clinical Research With Large Language Models Generated Writing—Clinical Research with AI-assisted Writing (CRAW) Study |
title_short | Clinical Research With Large Language Models Generated Writing—Clinical Research with AI-assisted Writing (CRAW) Study |
title_sort | clinical research with large language models generated writing—clinical research with ai-assisted writing (craw) study |
topic | Observational Study |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10547240/ https://www.ncbi.nlm.nih.gov/pubmed/37795455 http://dx.doi.org/10.1097/CCE.0000000000000975 |
work_keys_str_mv | AT huespeivana clinicalresearchwithlargelanguagemodelsgeneratedwritingclinicalresearchwithaiassistedwritingcrawstudy AT echeverrijorge clinicalresearchwithlargelanguagemodelsgeneratedwritingclinicalresearchwithaiassistedwritingcrawstudy AT khalidaisha clinicalresearchwithlargelanguagemodelsgeneratedwritingclinicalresearchwithaiassistedwritingcrawstudy AT carbonibissoindalecio clinicalresearchwithlargelanguagemodelsgeneratedwritingclinicalresearchwithaiassistedwritingcrawstudy AT mussocarlosg clinicalresearchwithlargelanguagemodelsgeneratedwritingclinicalresearchwithaiassistedwritingcrawstudy AT suranisalim clinicalresearchwithlargelanguagemodelsgeneratedwritingclinicalresearchwithaiassistedwritingcrawstudy AT bansalvikas clinicalresearchwithlargelanguagemodelsgeneratedwritingclinicalresearchwithaiassistedwritingcrawstudy AT kashyaprahul clinicalresearchwithlargelanguagemodelsgeneratedwritingclinicalresearchwithaiassistedwritingcrawstudy |
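
The description field reports sensitivity and specificity for telling researcher-written from GPT-3.5-written backgrounds. For readers unfamiliar with these measures, the following is a minimal sketch of how they are computed from a confusion matrix. The counts are hypothetical and are not taken from the study, the function name is illustrative, and which class counts as "positive" is an assumption, since the record does not state it.

```python
def sensitivity_specificity(tp: int, fn: int, tn: int, fp: int) -> tuple[float, float]:
    """Return (sensitivity, specificity) from confusion-matrix counts.

    tp/fn: positive cases identified correctly / missed.
    tn/fp: negative cases identified correctly / mislabeled as positive.
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")
    sensitivity = tp / (tp + fn)  # share of true positives that were detected
    specificity = tn / (tn + fp)  # share of true negatives that were detected
    return sensitivity, specificity


# Hypothetical counts for illustration only (NOT the study's data).
sens, spec = sensitivity_specificity(tp=20, fn=80, tn=60, fp=40)
print(f"sensitivity={sens:.1%}, specificity={spec:.1%}")
```

A specificity near 50%, as reported in the record, is roughly what guessing would produce on a two-way judgment, which is consistent with the abstract's description of detection as poor.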