Fine-tuning large neural language models for biomedical natural language processing
Large neural language models have transformed modern natural language processing (NLP) applications. However, fine-tuning such models for specific tasks remains challenging as model size increases, especially with small labeled datasets, which are common in biomedical NLP. We conduct a systematic study on fine-tuning stability in biomedical NLP. We show that fine-tuning performance may be sensitive to pretraining settings and conduct an exploration of techniques for addressing fine-tuning instability. We show that these techniques can substantially improve fine-tuning performance for low-resource biomedical NLP applications. Specifically, freezing lower layers is helpful for standard BERT-BASE models, while layerwise decay is more effective for BERT-LARGE and ELECTRA models. For low-resource text similarity tasks, such as BIOSSES, reinitializing the top layers is the optimal strategy. Overall, domain-specific vocabulary and pretraining facilitate robust models for fine-tuning. Based on these findings, we establish a new state of the art on a wide range of biomedical NLP applications.
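The abstract names three stabilization techniques: freezing lower layers, layerwise learning-rate decay, and reinitializing top layers. Below is a minimal sketch, assuming a Hugging Face `transformers` BERT-style classifier, of how the first two could be wired up; the checkpoint name, number of frozen layers, learning rate, and decay factor are illustrative assumptions, not the authors' settings.

```python
# Minimal sketch (not the authors' released code) of two of the fine-tuning
# stabilization techniques described in the abstract: freezing lower encoder
# layers and layerwise learning-rate decay.
import torch
from transformers import AutoModelForSequenceClassification

model = AutoModelForSequenceClassification.from_pretrained(
    "microsoft/BiomedNLP-PubMedBERT-base-uncased-abstract-fulltext",  # assumed checkpoint
    num_labels=2,
)

# Technique 1: freeze the embeddings and the lowest K transformer layers
# (reported helpful for BERT-BASE-sized models).
K = 4  # illustrative choice
for param in model.bert.embeddings.parameters():
    param.requires_grad = False
for layer in model.bert.encoder.layer[:K]:
    for param in layer.parameters():
        param.requires_grad = False

# Technique 2: layerwise learning-rate decay (reported more effective for
# BERT-LARGE and ELECTRA): lower layers get geometrically smaller learning
# rates than the top layers and the task head.
base_lr, decay = 2e-5, 0.9  # illustrative values
num_layers = model.config.num_hidden_layers
param_groups = []
for i, layer in enumerate(model.bert.encoder.layer):
    trainable = [p for p in layer.parameters() if p.requires_grad]
    if not trainable:
        continue  # fully frozen layer, nothing to optimize
    param_groups.append(
        {"params": trainable, "lr": base_lr * decay ** (num_layers - 1 - i)}
    )

# Task head (and pooler, if present) trains at the full base learning rate.
head_params = list(model.classifier.parameters())
if model.bert.pooler is not None:
    head_params += list(model.bert.pooler.parameters())
param_groups.append({"params": head_params, "lr": base_lr})

optimizer = torch.optim.AdamW(param_groups, weight_decay=0.01)
```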
Main Authors: | Tinn, Robert; Cheng, Hao; Gu, Yu; Usuyama, Naoto; Liu, Xiaodong; Naumann, Tristan; Gao, Jianfeng; Poon, Hoifung |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Elsevier, 2023 |
Subjects: | Article |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10140607/ https://www.ncbi.nlm.nih.gov/pubmed/37123444 http://dx.doi.org/10.1016/j.patter.2023.100729 |
_version_ | 1785033199949185024 |
---|---|
author | Tinn, Robert; Cheng, Hao; Gu, Yu; Usuyama, Naoto; Liu, Xiaodong; Naumann, Tristan; Gao, Jianfeng; Poon, Hoifung |
author_facet | Tinn, Robert; Cheng, Hao; Gu, Yu; Usuyama, Naoto; Liu, Xiaodong; Naumann, Tristan; Gao, Jianfeng; Poon, Hoifung |
author_sort | Tinn, Robert |
collection | PubMed |
description | Large neural language models have transformed modern natural language processing (NLP) applications. However, fine-tuning such models for specific tasks remains challenging as model size increases, especially with small labeled datasets, which are common in biomedical NLP. We conduct a systematic study on fine-tuning stability in biomedical NLP. We show that fine-tuning performance may be sensitive to pretraining settings and conduct an exploration of techniques for addressing fine-tuning instability. We show that these techniques can substantially improve fine-tuning performance for low-resource biomedical NLP applications. Specifically, freezing lower layers is helpful for standard BERT-BASE models, while layerwise decay is more effective for BERT-LARGE and ELECTRA models. For low-resource text similarity tasks, such as BIOSSES, reinitializing the top layers is the optimal strategy. Overall, domain-specific vocabulary and pretraining facilitate robust models for fine-tuning. Based on these findings, we establish a new state of the art on a wide range of biomedical NLP applications. |
format | Online Article Text |
id | pubmed-10140607 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Elsevier |
record_format | MEDLINE/PubMed |
spelling | pubmed-10140607 2023-04-29 Fine-tuning large neural language models for biomedical natural language processing. Tinn, Robert; Cheng, Hao; Gu, Yu; Usuyama, Naoto; Liu, Xiaodong; Naumann, Tristan; Gao, Jianfeng; Poon, Hoifung. Patterns (N Y), Article. Large neural language models have transformed modern natural language processing (NLP) applications. However, fine-tuning such models for specific tasks remains challenging as model size increases, especially with small labeled datasets, which are common in biomedical NLP. We conduct a systematic study on fine-tuning stability in biomedical NLP. We show that fine-tuning performance may be sensitive to pretraining settings and conduct an exploration of techniques for addressing fine-tuning instability. We show that these techniques can substantially improve fine-tuning performance for low-resource biomedical NLP applications. Specifically, freezing lower layers is helpful for standard BERT-BASE models, while layerwise decay is more effective for BERT-LARGE and ELECTRA models. For low-resource text similarity tasks, such as BIOSSES, reinitializing the top layers is the optimal strategy. Overall, domain-specific vocabulary and pretraining facilitate robust models for fine-tuning. Based on these findings, we establish a new state of the art on a wide range of biomedical NLP applications. Elsevier 2023-04-14 /pmc/articles/PMC10140607/ /pubmed/37123444 http://dx.doi.org/10.1016/j.patter.2023.100729 Text en © 2023 The Authors. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/). |
spellingShingle | Article. Tinn, Robert; Cheng, Hao; Gu, Yu; Usuyama, Naoto; Liu, Xiaodong; Naumann, Tristan; Gao, Jianfeng; Poon, Hoifung. Fine-tuning large neural language models for biomedical natural language processing |
title | Fine-tuning large neural language models for biomedical natural language processing |
title_full | Fine-tuning large neural language models for biomedical natural language processing |
title_fullStr | Fine-tuning large neural language models for biomedical natural language processing |
title_full_unstemmed | Fine-tuning large neural language models for biomedical natural language processing |
title_short | Fine-tuning large neural language models for biomedical natural language processing |
title_sort | fine-tuning large neural language models for biomedical natural language processing |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10140607/ https://www.ncbi.nlm.nih.gov/pubmed/37123444 http://dx.doi.org/10.1016/j.patter.2023.100729 |
work_keys_str_mv | AT tinnrobert finetuninglargeneurallanguagemodelsforbiomedicalnaturallanguageprocessing AT chenghao finetuninglargeneurallanguagemodelsforbiomedicalnaturallanguageprocessing AT guyu finetuninglargeneurallanguagemodelsforbiomedicalnaturallanguageprocessing AT usuyamanaoto finetuninglargeneurallanguagemodelsforbiomedicalnaturallanguageprocessing AT liuxiaodong finetuninglargeneurallanguagemodelsforbiomedicalnaturallanguageprocessing AT naumanntristan finetuninglargeneurallanguagemodelsforbiomedicalnaturallanguageprocessing AT gaojianfeng finetuninglargeneurallanguagemodelsforbiomedicalnaturallanguageprocessing AT poonhoifung finetuninglargeneurallanguagemodelsforbiomedicalnaturallanguageprocessing |