
Fine-tuning large neural language models for biomedical natural language processing

Large neural language models have transformed modern natural language processing (NLP) applications. However, fine-tuning such models for specific tasks remains challenging as model size increases, especially with small labeled datasets, which are common in biomedical NLP. We conduct a systematic study on fine-tuning stability in biomedical NLP. We show that fine-tuning performance may be sensitive to pretraining settings and conduct an exploration of techniques for addressing fine-tuning instability. We show that these techniques can substantially improve fine-tuning performance for low-resource biomedical NLP applications. Specifically, freezing lower layers is helpful for standard BERT-BASE models, while layerwise decay is more effective for BERT-LARGE and ELECTRA models. For low-resource text similarity tasks, such as BIOSSES, reinitializing the top layers is the optimal strategy. Overall, domain-specific vocabulary and pretraining facilitate robust models for fine-tuning. Based on these findings, we establish a new state of the art on a wide range of biomedical NLP applications.
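
The abstract names three stabilization techniques: freezing lower encoder layers, layerwise learning-rate decay, and re-initializing the top layers. The sketch below shows what each can look like for a Hugging Face BERT-style classifier; the checkpoint name, layer counts, and learning rates are illustrative assumptions, not the authors' exact configuration.

# Sketch only: checkpoint, layer counts, and learning rates are illustrative
# assumptions, not the paper's exact setup.
import torch
from transformers import AutoModelForSequenceClassification

# Any BERT-style checkpoint works; a biomedical one would be used in practice.
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2
)

def freeze_lower_layers(model, num_frozen=4):
    """Freeze the embeddings and the lowest encoder layers
    (reported helpful for BERT-BASE models)."""
    for p in model.bert.embeddings.parameters():
        p.requires_grad = False
    for layer in model.bert.encoder.layer[:num_frozen]:
        for p in layer.parameters():
            p.requires_grad = False

def layerwise_lr_groups(model, base_lr=2e-5, decay=0.9):
    """Layerwise learning-rate decay (reported more effective for BERT-LARGE
    and ELECTRA models): layers farther from the top get smaller rates."""
    layers = [model.bert.embeddings] + list(model.bert.encoder.layer)
    groups = [
        {"params": layer.parameters(), "lr": base_lr * decay**depth}
        for depth, layer in enumerate(reversed(layers))
    ]
    head = list(model.bert.pooler.parameters()) + list(model.classifier.parameters())
    groups.append({"params": head, "lr": base_lr})
    return groups

def reinit_top_layers(model, num_reinit=2):
    """Re-initialize the top encoder layers (reported optimal for
    low-resource similarity tasks such as BIOSSES)."""
    for layer in model.bert.encoder.layer[-num_reinit:]:
        layer.apply(model._init_weights)

# In practice one would pick the technique that matches the model size and
# task; they are combined here only to illustrate the mechanics.
freeze_lower_layers(model)
optimizer = torch.optim.AdamW(layerwise_lr_groups(model))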

Bibliographic Details
Main Authors: Tinn, Robert; Cheng, Hao; Gu, Yu; Usuyama, Naoto; Liu, Xiaodong; Naumann, Tristan; Gao, Jianfeng; Poon, Hoifung
Format: Online Article Text
Language: English
Journal: Patterns (N Y)
Published: Elsevier, 2023-04-14
Subjects: Article
Rights: © 2023 The Authors. Open access under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10140607/
https://www.ncbi.nlm.nih.gov/pubmed/37123444
http://dx.doi.org/10.1016/j.patter.2023.100729