
On the effectiveness of compact biomedical transformers

Bibliographic Details
Main Authors: Rohanian, Omid, Nouriborji, Mohammadmahdi, Kouchaki, Samaneh, Clifton, David A
Format: Online Article Text
Language: English
Published: Oxford University Press 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10027428/
https://www.ncbi.nlm.nih.gov/pubmed/36825820
http://dx.doi.org/10.1093/bioinformatics/btad103
author Rohanian, Omid
Nouriborji, Mohammadmahdi
Kouchaki, Samaneh
Clifton, David A
collection PubMed
description MOTIVATION: Language models pre-trained on biomedical corpora, such as BioBERT, have recently shown promising results on downstream biomedical tasks. Many existing pre-trained models, on the other hand, are resource-intensive and computationally heavy owing to factors such as embedding size, hidden dimension and number of layers. The natural language processing community has developed numerous strategies to compress these models using techniques such as pruning, quantization and knowledge distillation, resulting in models that are considerably faster, smaller and consequently easier to use in practice. By the same token, in this article, we introduce six lightweight models, namely BioDistilBERT, BioTinyBERT, BioMobileBERT, DistilBioBERT, TinyBioBERT and CompactBioBERT, which are obtained either by knowledge distillation from a biomedical teacher or by continual learning on the PubMed dataset. We evaluate all of our models on three biomedical tasks and compare them with BioBERT-v1.1 with the aim of producing efficient lightweight models that perform on par with their larger counterparts. RESULTS: We trained six different models in total, with the largest having 65 million parameters and the smallest having 15 million; a far lower range of parameters than BioBERT’s 110M. Based on our experiments on three different biomedical tasks, we found that models distilled from a biomedical teacher and models additionally pre-trained on the PubMed dataset can retain up to 98.8% and 98.6% of the performance of BioBERT-v1.1, respectively. Overall, our best model below 30M parameters is BioMobileBERT, while our best models over 30M parameters are DistilBioBERT and CompactBioBERT, which retain up to 98.2% and 98.8% of the performance of BioBERT-v1.1, respectively. AVAILABILITY AND IMPLEMENTATION: Code is available at: https://github.com/nlpie-research/Compact-Biomedical-Transformers. Trained models can be accessed at: https://huggingface.co/nlpie.
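The availability statement above points to trained models released under the nlpie organization on the Hugging Face Hub. A minimal sketch of loading one of these checkpoints with the Hugging Face transformers library is shown below; the exact repository name used here (nlpie/distil-biobert) and the example sentence are assumptions not taken from this record, so the model id should be verified against https://huggingface.co/nlpie.

```python
# Minimal sketch: load a compact biomedical checkpoint and run a quick
# masked-LM sanity check. The repository id below is an assumption; check
# https://huggingface.co/nlpie for the actual model names.
from transformers import AutoModelForMaskedLM, AutoTokenizer, pipeline

model_id = "nlpie/distil-biobert"  # hypothetical id; substitute the one listed on the hub

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForMaskedLM.from_pretrained(model_id)

# Fill-mask pipeline over a biomedical sentence to confirm the checkpoint loads
# and produces sensible token predictions.
fill = pipeline("fill-mask", model=model, tokenizer=tokenizer)
example = f"Aspirin is used to treat {tokenizer.mask_token} and inflammation."
for pred in fill(example):
    print(pred["token_str"], round(pred["score"], 3))
```

For fine-tuning on downstream biomedical tasks such as those mentioned in the abstract, the masked-LM head can be swapped for AutoModelForTokenClassification or AutoModelForSequenceClassification from the same library.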
format Online
Article
Text
id pubmed-10027428
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-10027428 2023-03-21 On the effectiveness of compact biomedical transformers. Rohanian, Omid; Nouriborji, Mohammadmahdi; Kouchaki, Samaneh; Clifton, David A. Bioinformatics, Original Paper. Oxford University Press 2023-02-24 /pmc/articles/PMC10027428/ /pubmed/36825820 http://dx.doi.org/10.1093/bioinformatics/btad103 Text en © The Author(s) 2023. Published by Oxford University Press. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title On the effectiveness of compact biomedical transformers
topic Original Paper
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10027428/
https://www.ncbi.nlm.nih.gov/pubmed/36825820
http://dx.doi.org/10.1093/bioinformatics/btad103