
Fine-tuning large neural language models for biomedical natural language processing

Bibliographic Details
Main Authors: Tinn, Robert; Cheng, Hao; Gu, Yu; Usuyama, Naoto; Liu, Xiaodong; Naumann, Tristan; Gao, Jianfeng; Poon, Hoifung
Format: Online Article Text
Language: English
Published: Elsevier 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10140607/
https://www.ncbi.nlm.nih.gov/pubmed/37123444
http://dx.doi.org/10.1016/j.patter.2023.100729
Description
Summary: Large neural language models have transformed modern natural language processing (NLP) applications. However, fine-tuning such models for specific tasks remains challenging as model size increases, especially with small labeled datasets, which are common in biomedical NLP. We conduct a systematic study on fine-tuning stability in biomedical NLP. We show that fine-tuning performance may be sensitive to pretraining settings and explore techniques for addressing fine-tuning instability. We show that these techniques can substantially improve fine-tuning performance for low-resource biomedical NLP applications. Specifically, freezing the lower layers is helpful for standard BERT-BASE models, while layerwise learning-rate decay is more effective for BERT-LARGE and ELECTRA models. For low-resource text similarity tasks, such as BIOSSES, reinitializing the top layers is the optimal strategy. Overall, domain-specific vocabulary and pretraining facilitate robust models for fine-tuning. Based on these findings, we establish a new state of the art on a wide range of biomedical NLP applications.
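The summary names three stabilization strategies: freezing the lower layers, layerwise learning-rate decay, and reinitializing the top layers. Below is a minimal sketch of the first two, assuming a Hugging Face BERT-style sequence classifier; it is not the authors' released code, and the checkpoint name, the number of frozen layers, the learning rates, and the decay factor are illustrative assumptions rather than the paper's reported settings.

```python
# Sketch of two fine-tuning stabilization strategies from the summary,
# assuming a BERT-style encoder loaded via Hugging Face transformers.
import torch
from transformers import AutoModelForSequenceClassification

model = AutoModelForSequenceClassification.from_pretrained(
    "microsoft/BiomedNLP-PubMedBERT-base-uncased-abstract-fulltext",  # example biomedical checkpoint
    num_labels=2,
)
encoder = model.bert  # embeddings + stacked transformer layers + pooler

# Strategy A (reported helpful for BERT-BASE-sized models): freeze the
# embeddings and the lowest k transformer layers so only upper layers are tuned.
def freeze_lower_layers(encoder, k=4):
    for param in encoder.embeddings.parameters():
        param.requires_grad = False
    for layer in encoder.encoder.layer[:k]:
        for param in layer.parameters():
            param.requires_grad = False

# Strategy B (reported more effective for BERT-LARGE and ELECTRA): layerwise
# learning-rate decay -- each layer's rate is the base rate scaled by
# decay**(depth below the top), so lower layers change more slowly.
def layerwise_decay_groups(model, base_lr=2e-5, decay=0.9):
    encoder = model.bert
    n = len(encoder.encoder.layer)
    groups = [{"params": encoder.embeddings.parameters(),
               "lr": base_lr * decay ** (n + 1)}]
    for i, layer in enumerate(encoder.encoder.layer):
        groups.append({"params": layer.parameters(),
                       "lr": base_lr * decay ** (n - i)})
    groups.append({"params": list(encoder.pooler.parameters())
                             + list(model.classifier.parameters()),
                   "lr": base_lr})
    return groups

# Typically only one strategy is applied per model; here layerwise decay
# is used to build the optimizer as an example.
optimizer = torch.optim.AdamW(layerwise_decay_groups(model), weight_decay=0.01)
```

For a text similarity task such as BIOSSES, the summary instead recommends reinitializing the top encoder layers before fine-tuning, which in the same setup would amount to resetting the weights of the last few entries of `encoder.encoder.layer`.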