Bioformer: an efficient transformer language model for biomedical text mining

Bibliographic Details
Main Authors: Fang, Li; Chen, Qingyu; Wei, Chih-Hsuan; Lu, Zhiyong; Wang, Kai
Format: Online Article Text
Language: English
Published: Cornell University 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10029052/
https://www.ncbi.nlm.nih.gov/pubmed/36945685
author Fang, Li
Chen, Qingyu
Wei, Chih-Hsuan
Lu, Zhiyong
Wang, Kai
author_sort Fang, Li
collection PubMed
description Pretrained language models such as Bidirectional Encoder Representations from Transformers (BERT) have achieved state-of-the-art performance in natural language processing (NLP) tasks. Recently, BERT has been adapted to the biomedical domain. Despite their effectiveness, these models have hundreds of millions of parameters and are computationally expensive when applied to large-scale NLP applications. We hypothesized that the number of parameters of the original BERT could be dramatically reduced with minor impact on performance. In this study, we present Bioformer, a compact BERT model for biomedical text mining. We pretrained two Bioformer models (named Bioformer(8L) and Bioformer(16L)) which reduced the model size by 60% compared to BERT(Base). Bioformer uses a biomedical vocabulary and was pre-trained from scratch on PubMed abstracts and PubMed Central full-text articles. We thoroughly evaluated the performance of Bioformer, as well as existing biomedical BERT models including BioBERT and PubMedBERT, on 15 benchmark datasets of four different biomedical NLP tasks: named entity recognition, relation extraction, question answering and document classification. The results show that with 60% fewer parameters, Bioformer(16L) is only 0.1% less accurate than PubMedBERT, while Bioformer(8L) is 0.9% less accurate than PubMedBERT. Both Bioformer(16L) and Bioformer(8L) outperformed BioBERT(Base-v1.1). In addition, Bioformer(16L) and Bioformer(8L) are two to three times as fast as PubMedBERT/BioBERT(Base-v1.1). Bioformer has been successfully deployed in PubTator Central, providing gene annotations for over 35 million PubMed abstracts and 5 million PubMed Central full-text articles. We make Bioformer publicly available via https://github.com/WGLab/bioformer, including pre-trained models, datasets, and instructions for downstream use.
format Online
Article
Text
id pubmed-10029052
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Cornell University
record_format MEDLINE/PubMed
spelling pubmed-10029052 2023-03-22 Bioformer: an efficient transformer language model for biomedical text mining Fang, Li; Chen, Qingyu; Wei, Chih-Hsuan; Lu, Zhiyong; Wang, Kai ArXiv Article Cornell University 2023-02-03 /pmc/articles/PMC10029052/ /pubmed/36945685 Text en https://creativecommons.org/licenses/by/4.0/ This work is licensed under a Creative Commons Attribution 4.0 International License (https://creativecommons.org/licenses/by/4.0/), which allows reusers to distribute, remix, adapt, and build upon the material in any medium or format, so long as attribution is given to the creator. The license allows for commercial use.
title Bioformer: an efficient transformer language model for biomedical text mining
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10029052/
https://www.ncbi.nlm.nih.gov/pubmed/36945685
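
The record's description notes that pre-trained Bioformer models are publicly available via https://github.com/WGLab/bioformer. As a minimal sketch of downstream use, the Python example below loads a checkpoint with the Hugging Face transformers library and runs a fill-mask sanity check; the Hub identifier bioformers/bioformer-8L and the probe sentence are illustrative assumptions, not taken from this record, so consult the repository for the actual model names and instructions.

from transformers import AutoModelForMaskedLM, AutoTokenizer, pipeline

# Assumed Hugging Face Hub identifier; see https://github.com/WGLab/bioformer
# for the actual checkpoint names and download instructions.
model_name = "bioformers/bioformer-8L"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForMaskedLM.from_pretrained(model_name)

# Bioformer is a BERT-style model pretrained on PubMed abstracts and PMC
# full-text articles, so a fill-mask probe is a quick check of the checkpoint.
unmasker = pipeline("fill-mask", model=model, tokenizer=tokenizer)
for prediction in unmasker("BRCA1 mutations increase the risk of breast [MASK]."):
    print(prediction["token_str"], round(prediction["score"], 3))

For the downstream tasks evaluated in the paper, such as the named entity recognition behind the PubTator Central gene annotations, the same checkpoint would instead be loaded with a task-specific head (for example, AutoModelForTokenClassification) and fine-tuned on the corresponding benchmark dataset.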