An extensive benchmark study on biomedical text generation and mining with ChatGPT

MOTIVATION: In recent years, the development of natural language processing (NLP) technologies and deep learning hardware has led to significant improvements in large language models (LLMs). ChatGPT, the state-of-the-art LLM built on GPT-3.5 and GPT-4, shows excellent capabilities in general languag...

Bibliographic Details
Main Authors: Chen, Qijie, Sun, Haotong, Liu, Haoyang, Jiang, Yinghui, Ran, Ting, Jin, Xurui, Xiao, Xianglu, Lin, Zhimin, Chen, Hongming, Niu, Zhangmin
Format: Online Article Text
Language: English
Published: Oxford University Press 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10562950/
https://www.ncbi.nlm.nih.gov/pubmed/37682111
http://dx.doi.org/10.1093/bioinformatics/btad557
author Chen, Qijie
Sun, Haotong
Liu, Haoyang
Jiang, Yinghui
Ran, Ting
Jin, Xurui
Xiao, Xianglu
Lin, Zhimin
Chen, Hongming
Niu, Zhangmin
collection PubMed
description MOTIVATION: In recent years, the development of natural language processing (NLP) technologies and deep learning hardware has led to significant improvements in large language models (LLMs). ChatGPT, the state-of-the-art LLM built on GPT-3.5 and GPT-4, shows excellent capabilities in general language understanding and reasoning. Researchers have also tested the GPT models on a variety of NLP-related tasks and benchmarks and obtained excellent results. Given this strong performance in everyday chat, researchers have begun to explore the capacity of ChatGPT in areas of expertise that require professional education for humans; we are interested in the biomedical domain. RESULTS: To evaluate the performance of ChatGPT on biomedical tasks, this article presents a comprehensive benchmark study on the use of ChatGPT for biomedical corpora, including article abstracts, clinical trial descriptions, biomedical questions, and so on. Typical NLP tasks such as named entity recognition, relation extraction, sentence similarity, question answering, and document classification are included. Overall, ChatGPT achieved a BLURB score of 58.50, while the state-of-the-art model had a score of 84.30. Through a series of experiments, we demonstrated the effectiveness and versatility of ChatGPT in biomedical text understanding, reasoning and generation, as well as the limitations of ChatGPT built on GPT-3.5. AVAILABILITY AND IMPLEMENTATION: All the datasets are available from the BLURB benchmark at https://microsoft.github.io/BLURB/index.html. The prompts are described in the article.
format Online
Article
Text
id pubmed-10562950
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-10562950 2023-10-11 Bioinformatics Original Paper Oxford University Press 2023-09-08 /pmc/articles/PMC10562950/ /pubmed/37682111 http://dx.doi.org/10.1093/bioinformatics/btad557 Text en © The Author(s) 2023.
Published by Oxford University Press. https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title An extensive benchmark study on biomedical text generation and mining with ChatGPT
topic Original Paper
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10562950/
https://www.ncbi.nlm.nih.gov/pubmed/37682111
http://dx.doi.org/10.1093/bioinformatics/btad557