
Medical text classification based on the discriminative pre-training model and prompt-tuning

Medical text classification, as a fundamental medical natural language processing task, aims to identify the categories to which a short medical text belongs. Current research has focused on performing the medical text classification task using a pre-training language model through fine-tuning. However, this paradigm introduces additional parameters when training extra classifiers. Recent studies have shown that the “prompt-tuning” paradigm induces better performance in many natural language processing tasks because it bridges the gap between pre-training goals and downstream tasks. The main idea of prompt-tuning is to transform binary or multi-class classification tasks into mask prediction tasks by fully exploiting the features learned by pre-training language models. This study explores, for the first time, how to classify medical texts using a discriminative pre-training language model called ERNIE-Health through prompt-tuning. Specifically, we attempt to perform prompt-tuning based on the multi-token selection task, which is a pre-training task of ERNIE-Health. The raw text is wrapped into a new sequence with a template in which the category label is replaced by a [UNK] token. The model is then trained to calculate the probability distribution of the candidate categories. Our method is tested on the KUAKE-Question Intention Classification and CHIP-Clinical Trial Criterion datasets and achieves accuracy values of 0.866 and 0.861, respectively. In addition, the loss of our model decreases faster throughout training than that of fine-tuning. The experimental results provide valuable insights to the community and suggest that prompt-tuning can be a promising approach to improving the performance of pre-training models in domain-specific tasks.


Bibliographic Details
Main Authors: Wang, Yu, Wang, Yuan, Peng, Zhenwan, Zhang, Feifan, Zhou, Luyao, Yang, Fei
Format: Online Article Text
Language: English
Published: SAGE Publications 2023
Subjects: Original Research
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10408339/
https://www.ncbi.nlm.nih.gov/pubmed/37559830
http://dx.doi.org/10.1177/20552076231193213
author Wang, Yu
Wang, Yuan
Peng, Zhenwan
Zhang, Feifan
Zhou, Luyao
Yang, Fei
collection PubMed
description Medical text classification, as a fundamental medical natural language processing task, aims to identify the categories to which a short medical text belongs. Current research has focused on performing the medical text classification task using a pre-training language model through fine-tuning. However, this paradigm introduces additional parameters when training extra classifiers. Recent studies have shown that the “prompt-tuning” paradigm induces better performance in many natural language processing tasks because it bridges the gap between pre-training goals and downstream tasks. The main idea of prompt-tuning is to transform binary or multi-class classification tasks into mask prediction tasks by fully exploiting the features learned by pre-training language models. This study explores, for the first time, how to classify medical texts using a discriminative pre-training language model called ERNIE-Health through prompt-tuning. Specifically, we attempt to perform prompt-tuning based on the multi-token selection task, which is a pre-training task of ERNIE-Health. The raw text is wrapped into a new sequence with a template in which the category label is replaced by a [UNK] token. The model is then trained to calculate the probability distribution of the candidate categories. Our method is tested on the KUAKE-Question Intention Classification and CHIP-Clinical Trial Criterion datasets and achieves accuracy values of 0.866 and 0.861, respectively. In addition, the loss of our model decreases faster throughout training than that of fine-tuning. The experimental results provide valuable insights to the community and suggest that prompt-tuning can be a promising approach to improving the performance of pre-training models in domain-specific tasks.
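To make the prompt-tuning recipe in the description concrete, the following minimal Python sketch illustrates only the core idea: the raw text is wrapped in a template whose label slot is a [UNK] token, and per-candidate scores are normalized into a probability distribution over the categories. The label set, template, and scoring heuristic here are hypothetical stand-ins; the paper's actual method scores candidates with ERNIE-Health's multi-token selection head.

# Minimal illustrative sketch (not the authors' code): wrap a medical query in a
# prompt template whose label slot is a [UNK] token, then turn per-candidate
# scores into a probability distribution over categories. The scoring function
# below is a toy stand-in for ERNIE-Health's multi-token selection head.
import math

CANDIDATE_LABELS = ["diagnosis", "cause", "treatment"]   # hypothetical label set
TEMPLATE = "[UNK]. {text}"                               # [UNK] marks the label position

def wrap_with_template(text: str) -> str:
    """Build the prompted input sequence from the raw text."""
    return TEMPLATE.format(text=text)

def score_candidate(prompted: str, label: str) -> float:
    """Toy score for `label` filling the [UNK] slot; a real system would run
    the discriminative pre-training model here."""
    return float(len(set(label) & set(prompted)))

def classify(text: str) -> dict:
    """Return a probability distribution over the candidate categories."""
    prompted = wrap_with_template(text)
    scores = [score_candidate(prompted, label) for label in CANDIDATE_LABELS]
    normalizer = sum(math.exp(s) for s in scores)        # softmax over candidates
    return {label: math.exp(s) / normalizer
            for label, s in zip(CANDIDATE_LABELS, scores)}

if __name__ == "__main__":
    print(classify("What medicine should I take for a persistent cough?"))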
format Online
Article
Text
id pubmed-10408339
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher SAGE Publications
record_format MEDLINE/PubMed
spelling pubmed-10408339 2023-08-09 Medical text classification based on the discriminative pre-training model and prompt-tuning. Digit Health, Original Research. SAGE Publications 2023-08-06 /pmc/articles/PMC10408339/ /pubmed/37559830 http://dx.doi.org/10.1177/20552076231193213 Text en © The Author(s) 2023. https://creativecommons.org/licenses/by-nc-nd/4.0/ This article is distributed under the terms of the Creative Commons Attribution-NonCommercial-NoDerivs 4.0 License (https://creativecommons.org/licenses/by-nc-nd/4.0/), which permits non-commercial use, reproduction and distribution of the work as published without adaptation or alteration, without further permission, provided the original work is attributed as specified on the SAGE and Open Access page (https://us.sagepub.com/en-us/nam/open-access-at-sage).
title Medical text classification based on the discriminative pre-training model and prompt-tuning
topic Original Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10408339/
https://www.ncbi.nlm.nih.gov/pubmed/37559830
http://dx.doi.org/10.1177/20552076231193213