
Distinguishing academic science writing from humans or ChatGPT with over 99% accuracy using off-the-shelf machine learning tools

ChatGPT has enabled access to artificial intelligence (AI)-generated writing for the masses, initiating a culture shift in the way people work, learn, and write. The need to discriminate human writing from AI is now both critical and urgent. Addressing this need, we report a method for discriminatin...


Bibliographic Details
Main authors: Desaire, Heather; Chua, Aleesa E.; Isom, Madeline; Jarosova, Romana; Hua, David
Format: Online Article Text
Language: English
Published: 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10328544/
https://www.ncbi.nlm.nih.gov/pubmed/37426542
http://dx.doi.org/10.1016/j.xcrp.2023.101426
_version_ 1785069821992370176
author Desaire, Heather
Chua, Aleesa E.
Isom, Madeline
Jarosova, Romana
Hua, David
author_sort Desaire, Heather
collection PubMed
description ChatGPT has enabled access to artificial intelligence (AI)-generated writing for the masses, initiating a culture shift in the way people work, learn, and write. The need to discriminate human writing from AI is now both critical and urgent. Addressing this need, we report a method for discriminating text generated by ChatGPT from (human) academic scientists, relying on prevalent and accessible supervised classification methods. The approach uses new features for discriminating (these) humans from AI; as examples, scientists write long paragraphs and have a penchant for equivocal language, frequently using words like “but,” “however,” and “although.” With a set of 20 features, we built a model that assigns the author, as human or AI, at over 99% accuracy. This strategy could be further adapted and developed by others with basic skills in supervised classification, enabling access to many highly accurate and targeted models for detecting AI usage in academic writing and beyond.
format Online
Article
Text
id pubmed-10328544
institution National Center for Biotechnology Information
language English
publishDate 2023
record_format MEDLINE/PubMed
spelling pubmed-10328544 2023-07-07 Cell Rep Phys Sci Article 2023-06-21 2023-06-07 /pmc/articles/PMC10328544/ /pubmed/37426542 http://dx.doi.org/10.1016/j.xcrp.2023.101426 Text en This is an open access article under the CC BY-NC-ND license (https://creativecommons.org/licenses/by-nc-nd/4.0/).
title Distinguishing academic science writing from humans or ChatGPT with over 99% accuracy using off-the-shelf machine learning tools
topic Article
work_keys_str_mv AT desaireheather distinguishingacademicsciencewritingfromhumansorchatgptwithover99accuracyusingofftheshelfmachinelearningtools
AT chuaaleesae distinguishingacademicsciencewritingfromhumansorchatgptwithover99accuracyusingofftheshelfmachinelearningtools
AT isommadeline distinguishingacademicsciencewritingfromhumansorchatgptwithover99accuracyusingofftheshelfmachinelearningtools
AT jarosovaromana distinguishingacademicsciencewritingfromhumansorchatgptwithover99accuracyusingofftheshelfmachinelearningtools
AT huadavid distinguishingacademicsciencewritingfromhumansorchatgptwithover99accuracyusingofftheshelfmachinelearningtools
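
The abstract in this record gives two examples of discriminating features: paragraph length and the frequency of equivocal words such as "but," "however," and "although." The following is a minimal sketch of how such features might be computed for one paragraph. It is not the authors' implementation; the record lists neither the full set of 20 features nor the classifier used. The function name `paragraph_features` and the feature names are hypothetical, chosen only for illustration.

```python
import re

# Words the abstract cites as examples of equivocal language.
EQUIVOCAL_WORDS = ("but", "however", "although")


def paragraph_features(paragraph: str) -> dict:
    """Compute illustrative features for one paragraph.

    Hypothetical helper: it covers only the two feature ideas named in
    the abstract (paragraph length and equivocal-word frequency).
    """
    # Lowercase the text and split it into simple word tokens.
    words = re.findall(r"[a-z']+", paragraph.lower())
    features = {"word_count": len(words)}
    # Count each equivocal word as a separate feature.
    for word in EQUIVOCAL_WORDS:
        features[f"count_{word}"] = words.count(word)
    return features


if __name__ == "__main__":
    sample = (
        "The signal was strong, but noisy. "
        "However, the trend held, although weakly."
    )
    print(paragraph_features(sample))
```

Feature dictionaries of this kind could then be passed to any standard supervised classifier, which is the general strategy the abstract describes.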