
Perception, performance, and detectability of conversational artificial intelligence across 32 university courses

The emergence of large language models has led to the development of powerful tools such as ChatGPT that can produce text indistinguishable from human-generated work. With the increasing accessibility of such technology, students across the globe may utilize it to help with their school work—a possi...

Bibliographic Details
Main Authors: Ibrahim, Hazem; Liu, Fengyuan; Asim, Rohail; Battu, Balaraju; Benabderrahmane, Sidahmed; Alhafni, Bashar; Adnan, Wifag; Alhanai, Tuka; AlShebli, Bedoor; Baghdadi, Riyadh; Bélanger, Jocelyn J.; Beretta, Elena; Celik, Kemal; Chaqfeh, Moumena; Daqaq, Mohammed F.; Bernoussi, Zaynab El; Fougnie, Daryl; Garcia de Soto, Borja; Gandolfi, Alberto; Gyorgy, Andras; Habash, Nizar; Harris, J. Andrew; Kaufman, Aaron; Kirousis, Lefteris; Kocak, Korhan; Lee, Kangsan; Lee, Seungah S.; Malik, Samreen; Maniatakos, Michail; Melcher, David; Mourad, Azzam; Park, Minsu; Rasras, Mahmoud; Reuben, Alicja; Zantout, Dania; Gleason, Nancy W.; Makovi, Kinga; Rahwan, Talal; Zaki, Yasir
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10449897/
https://www.ncbi.nlm.nih.gov/pubmed/37620342
http://dx.doi.org/10.1038/s41598-023-38964-3
_version_ 1785095067083472896
author Ibrahim, Hazem
Liu, Fengyuan
Asim, Rohail
Battu, Balaraju
Benabderrahmane, Sidahmed
Alhafni, Bashar
Adnan, Wifag
Alhanai, Tuka
AlShebli, Bedoor
Baghdadi, Riyadh
Bélanger, Jocelyn J.
Beretta, Elena
Celik, Kemal
Chaqfeh, Moumena
Daqaq, Mohammed F.
Bernoussi, Zaynab El
Fougnie, Daryl
Garcia de Soto, Borja
Gandolfi, Alberto
Gyorgy, Andras
Habash, Nizar
Harris, J. Andrew
Kaufman, Aaron
Kirousis, Lefteris
Kocak, Korhan
Lee, Kangsan
Lee, Seungah S.
Malik, Samreen
Maniatakos, Michail
Melcher, David
Mourad, Azzam
Park, Minsu
Rasras, Mahmoud
Reuben, Alicja
Zantout, Dania
Gleason, Nancy W.
Makovi, Kinga
Rahwan, Talal
Zaki, Yasir
author_sort Ibrahim, Hazem
collection PubMed
description The emergence of large language models has led to the development of powerful tools such as ChatGPT that can produce text indistinguishable from human-generated work. With the increasing accessibility of such technology, students across the globe may utilize it to help with their school work—a possibility that has sparked ample discussion on the integrity of student evaluation processes in the age of artificial intelligence (AI). To date, it is unclear how such tools perform compared to students on university-level courses across various disciplines. Further, students’ perspectives regarding the use of such tools in school work, and educators’ perspectives on treating their use as plagiarism, remain unknown. Here, we compare the performance of the state-of-the-art tool, ChatGPT, against that of students on 32 university-level courses. We also assess the degree to which its use can be detected by two classifiers designed specifically for this purpose. Additionally, we conduct a global survey across five countries, as well as a more in-depth survey at the authors’ institution, to discern students’ and educators’ perceptions of ChatGPT’s use in school work. We find that ChatGPT’s performance is comparable, if not superior, to that of students in a multitude of courses. Moreover, current AI-text classifiers cannot reliably detect ChatGPT’s use in school work, due to both their propensity to classify human-written answers as AI-generated, as well as the relative ease with which AI-generated text can be edited to evade detection. Finally, there seems to be an emerging consensus among students to use the tool, and among educators to treat its use as plagiarism. Our findings offer insights that could guide policy discussions addressing the integration of artificial intelligence into educational frameworks.
format Online
Article
Text
id pubmed-10449897
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-104498972023-08-26 Perception, performance, and detectability of conversational artificial intelligence across 32 university courses Ibrahim, Hazem Liu, Fengyuan Asim, Rohail Battu, Balaraju Benabderrahmane, Sidahmed Alhafni, Bashar Adnan, Wifag Alhanai, Tuka AlShebli, Bedoor Baghdadi, Riyadh Bélanger, Jocelyn J. Beretta, Elena Celik, Kemal Chaqfeh, Moumena Daqaq, Mohammed F. Bernoussi, Zaynab El Fougnie, Daryl Garcia de Soto, Borja Gandolfi, Alberto Gyorgy, Andras Habash, Nizar Harris, J. Andrew Kaufman, Aaron Kirousis, Lefteris Kocak, Korhan Lee, Kangsan Lee, Seungah S. Malik, Samreen Maniatakos, Michail Melcher, David Mourad, Azzam Park, Minsu Rasras, Mahmoud Reuben, Alicja Zantout, Dania Gleason, Nancy W. Makovi, Kinga Rahwan, Talal Zaki, Yasir Sci Rep Article The emergence of large language models has led to the development of powerful tools such as ChatGPT that can produce text indistinguishable from human-generated work. With the increasing accessibility of such technology, students across the globe may utilize it to help with their school work—a possibility that has sparked ample discussion on the integrity of student evaluation processes in the age of artificial intelligence (AI). To date, it is unclear how such tools perform compared to students on university-level courses across various disciplines. Further, students’ perspectives regarding the use of such tools in school work, and educators’ perspectives on treating their use as plagiarism, remain unknown. Here, we compare the performance of the state-of-the-art tool, ChatGPT, against that of students on 32 university-level courses. We also assess the degree to which its use can be detected by two classifiers designed specifically for this purpose. Additionally, we conduct a global survey across five countries, as well as a more in-depth survey at the authors’ institution, to discern students’ and educators’ perceptions of ChatGPT’s use in school work. 
We find that ChatGPT’s performance is comparable, if not superior, to that of students in a multitude of courses. Moreover, current AI-text classifiers cannot reliably detect ChatGPT’s use in school work, due to both their propensity to classify human-written answers as AI-generated, as well as the relative ease with which AI-generated text can be edited to evade detection. Finally, there seems to be an emerging consensus among students to use the tool, and among educators to treat its use as plagiarism. Our findings offer insights that could guide policy discussions addressing the integration of artificial intelligence into educational frameworks. Nature Publishing Group UK 2023-08-24 /pmc/articles/PMC10449897/ /pubmed/37620342 http://dx.doi.org/10.1038/s41598-023-38964-3 Text en © The Author(s) 2023, corrected publication 2023 https://creativecommons.org/licenses/by/4.0/Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ (https://creativecommons.org/licenses/by/4.0/) .
title Perception, performance, and detectability of conversational artificial intelligence across 32 university courses
topic Article