An extensive benchmark study on biomedical text generation and mining with ChatGPT
MOTIVATION: In recent years, the development of natural language processing (NLP) technologies and deep learning hardware has led to significant improvement in large language models (LLMs). ChatGPT, the state-of-the-art LLM built on GPT-3.5 and GPT-4, shows excellent capabilities in general languag...
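Main authors: | Chen, Qijie; Sun, Haotong; Liu, Haoyang; Jiang, Yinghui; Ran, Ting; Jin, Xurui; Xiao, Xianglu; Lin, Zhimin; Chen, Hongming; Niu, Zhangmin |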
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press, 2023 |
Subjects: | Original Paper |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10562950/ https://www.ncbi.nlm.nih.gov/pubmed/37682111 http://dx.doi.org/10.1093/bioinformatics/btad557 |
_version_ | 1785118241963638784 |
---|---|
author | Chen, Qijie; Sun, Haotong; Liu, Haoyang; Jiang, Yinghui; Ran, Ting; Jin, Xurui; Xiao, Xianglu; Lin, Zhimin; Chen, Hongming; Niu, Zhangmin |
author_facet | Chen, Qijie; Sun, Haotong; Liu, Haoyang; Jiang, Yinghui; Ran, Ting; Jin, Xurui; Xiao, Xianglu; Lin, Zhimin; Chen, Hongming; Niu, Zhangmin |
author_sort | Chen, Qijie |
collection | PubMed |
description | MOTIVATION: In recent years, the development of natural language processing (NLP) technologies and deep learning hardware has led to significant improvement in large language models (LLMs). ChatGPT, the state-of-the-art LLM built on GPT-3.5 and GPT-4, shows excellent capabilities in general language understanding and reasoning. Researchers have also tested the GPT models on a variety of NLP-related tasks and benchmarks and obtained excellent results. Given its strong performance in everyday chat, researchers have begun to explore the capacity of ChatGPT in areas of expertise that require professional education for humans; here we focus on the biomedical domain. RESULTS: To evaluate the performance of ChatGPT on biomedical tasks, this article presents a comprehensive benchmark study on the use of ChatGPT for biomedical corpora, including article abstracts, clinical trial descriptions, biomedical questions, and so on. Typical NLP tasks such as named entity recognition, relation extraction, sentence similarity, question answering, and document classification are included. Overall, ChatGPT achieved a BLURB score of 58.50, while the state-of-the-art model had a score of 84.30. Through a series of experiments, we demonstrated the effectiveness and versatility of ChatGPT in biomedical text understanding, reasoning, and generation, as well as the limitations of ChatGPT built on GPT-3.5. AVAILABILITY AND IMPLEMENTATION: All the datasets are available from the BLURB benchmark: https://microsoft.github.io/BLURB/index.html. The prompts are described in the article. |
format | Online Article Text |
id | pubmed-10562950 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-10562950 2023-10-11 An extensive benchmark study on biomedical text generation and mining with ChatGPT Chen, Qijie; Sun, Haotong; Liu, Haoyang; Jiang, Yinghui; Ran, Ting; Jin, Xurui; Xiao, Xianglu; Lin, Zhimin; Chen, Hongming; Niu, Zhangmin Bioinformatics Original Paper MOTIVATION: In recent years, the development of natural language processing (NLP) technologies and deep learning hardware has led to significant improvement in large language models (LLMs). ChatGPT, the state-of-the-art LLM built on GPT-3.5 and GPT-4, shows excellent capabilities in general language understanding and reasoning. Researchers have also tested the GPT models on a variety of NLP-related tasks and benchmarks and obtained excellent results. Given its strong performance in everyday chat, researchers have begun to explore the capacity of ChatGPT in areas of expertise that require professional education for humans; here we focus on the biomedical domain. RESULTS: To evaluate the performance of ChatGPT on biomedical tasks, this article presents a comprehensive benchmark study on the use of ChatGPT for biomedical corpora, including article abstracts, clinical trial descriptions, biomedical questions, and so on. Typical NLP tasks such as named entity recognition, relation extraction, sentence similarity, question answering, and document classification are included. Overall, ChatGPT achieved a BLURB score of 58.50, while the state-of-the-art model had a score of 84.30. Through a series of experiments, we demonstrated the effectiveness and versatility of ChatGPT in biomedical text understanding, reasoning, and generation, as well as the limitations of ChatGPT built on GPT-3.5. AVAILABILITY AND IMPLEMENTATION: All the datasets are available from the BLURB benchmark: https://microsoft.github.io/BLURB/index.html. The prompts are described in the article. Oxford University Press 2023-09-08 /pmc/articles/PMC10562950/ /pubmed/37682111 http://dx.doi.org/10.1093/bioinformatics/btad557 Text en © The Author(s) 2023. Published by Oxford University Press. https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Original Paper Chen, Qijie Sun, Haotong Liu, Haoyang Jiang, Yinghui Ran, Ting Jin, Xurui Xiao, Xianglu Lin, Zhimin Chen, Hongming Niu, Zhangmin An extensive benchmark study on biomedical text generation and mining with ChatGPT |
title | An extensive benchmark study on biomedical text generation and mining with ChatGPT |
title_full | An extensive benchmark study on biomedical text generation and mining with ChatGPT |
title_fullStr | An extensive benchmark study on biomedical text generation and mining with ChatGPT |
title_full_unstemmed | An extensive benchmark study on biomedical text generation and mining with ChatGPT |
title_short | An extensive benchmark study on biomedical text generation and mining with ChatGPT |
title_sort | extensive benchmark study on biomedical text generation and mining with chatgpt |
topic | Original Paper |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10562950/ https://www.ncbi.nlm.nih.gov/pubmed/37682111 http://dx.doi.org/10.1093/bioinformatics/btad557 |
work_keys_str_mv | AT chenqijie anextensivebenchmarkstudyonbiomedicaltextgenerationandminingwithchatgpt AT sunhaotong anextensivebenchmarkstudyonbiomedicaltextgenerationandminingwithchatgpt AT liuhaoyang anextensivebenchmarkstudyonbiomedicaltextgenerationandminingwithchatgpt AT jiangyinghui anextensivebenchmarkstudyonbiomedicaltextgenerationandminingwithchatgpt AT ranting anextensivebenchmarkstudyonbiomedicaltextgenerationandminingwithchatgpt AT jinxurui anextensivebenchmarkstudyonbiomedicaltextgenerationandminingwithchatgpt AT xiaoxianglu anextensivebenchmarkstudyonbiomedicaltextgenerationandminingwithchatgpt AT linzhimin anextensivebenchmarkstudyonbiomedicaltextgenerationandminingwithchatgpt AT chenhongming anextensivebenchmarkstudyonbiomedicaltextgenerationandminingwithchatgpt AT niuzhangmin anextensivebenchmarkstudyonbiomedicaltextgenerationandminingwithchatgpt AT chenqijie extensivebenchmarkstudyonbiomedicaltextgenerationandminingwithchatgpt AT sunhaotong extensivebenchmarkstudyonbiomedicaltextgenerationandminingwithchatgpt AT liuhaoyang extensivebenchmarkstudyonbiomedicaltextgenerationandminingwithchatgpt AT jiangyinghui extensivebenchmarkstudyonbiomedicaltextgenerationandminingwithchatgpt AT ranting extensivebenchmarkstudyonbiomedicaltextgenerationandminingwithchatgpt AT jinxurui extensivebenchmarkstudyonbiomedicaltextgenerationandminingwithchatgpt AT xiaoxianglu extensivebenchmarkstudyonbiomedicaltextgenerationandminingwithchatgpt AT linzhimin extensivebenchmarkstudyonbiomedicaltextgenerationandminingwithchatgpt AT chenhongming extensivebenchmarkstudyonbiomedicaltextgenerationandminingwithchatgpt AT niuzhangmin extensivebenchmarkstudyonbiomedicaltextgenerationandminingwithchatgpt |