A Review of Recent Work in Transfer Learning and Domain Adaptation for Natural Language Processing of Electronic Health Records
Objectives: We survey recent work in biomedical NLP on building more adaptable or generalizable models, with a focus on work dealing with electronic health record (EHR) texts, to better understand recent trends in this area and identify opportunities for future research. Methods: We searched PubMed,...
Main Authors: Laparra, Egoitz; Mascio, Aurelie; Velupillai, Sumithra; Miller, Timothy
Format: Online Article Text
Language: English
Published: Georg Thieme Verlag KG, 2021
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8416218/ https://www.ncbi.nlm.nih.gov/pubmed/34479396 http://dx.doi.org/10.1055/s-0041-1726522
_version_ | 1783748133839175680 |
author | Laparra, Egoitz; Mascio, Aurelie; Velupillai, Sumithra; Miller, Timothy
author_facet | Laparra, Egoitz; Mascio, Aurelie; Velupillai, Sumithra; Miller, Timothy
author_sort | Laparra, Egoitz |
collection | PubMed |
description | Objectives: We survey recent work in biomedical NLP on building more adaptable or generalizable models, with a focus on work dealing with electronic health record (EHR) texts, to better understand recent trends in this area and identify opportunities for future research. Methods: We searched PubMed, the Institute of Electrical and Electronics Engineers (IEEE), the Association for Computational Linguistics (ACL) anthology, the Association for the Advancement of Artificial Intelligence (AAAI) proceedings, and Google Scholar for the years 2018-2020. We reviewed abstracts to identify the most relevant and impactful work, and manually extracted data points from each of these papers to characterize the types of methods and tasks that were studied, in which clinical domains, and current state-of-the-art results. Results: The ubiquity of pre-trained transformers in clinical NLP research has contributed to an increase in domain adaptation and generalization-focused work that uses these models as the key component. Most recently, work has started to train biomedical transformers and to extend the fine-tuning process with additional domain adaptation techniques. We also highlight recent research in cross-lingual adaptation, as a special case of adaptation. Conclusions: While pre-trained transformer models have led to some large performance improvements, general domain pre-training does not always transfer adequately to the clinical domain due to its highly specialized language. There is also much work to be done in showing that the gains obtained by pre-trained transformers are beneficial in real world use cases. The amount of work in domain adaptation and transfer learning is limited by dataset availability and creating datasets for new domains is challenging. The growing body of research in languages other than English is encouraging, and more collaboration between researchers across the language divide would likely accelerate progress in non-English clinical NLP. |
format | Online Article Text |
id | pubmed-8416218 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | Georg Thieme Verlag KG |
record_format | MEDLINE/PubMed |
spelling | pubmed-8416218 2021-09-07 A Review of Recent Work in Transfer Learning and Domain Adaptation for Natural Language Processing of Electronic Health Records Laparra, Egoitz; Mascio, Aurelie; Velupillai, Sumithra; Miller, Timothy. Yearb Med Inform. Objectives: We survey recent work in biomedical NLP on building more adaptable or generalizable models, with a focus on work dealing with electronic health record (EHR) texts, to better understand recent trends in this area and identify opportunities for future research. Methods: We searched PubMed, the Institute of Electrical and Electronics Engineers (IEEE), the Association for Computational Linguistics (ACL) anthology, the Association for the Advancement of Artificial Intelligence (AAAI) proceedings, and Google Scholar for the years 2018-2020. We reviewed abstracts to identify the most relevant and impactful work, and manually extracted data points from each of these papers to characterize the types of methods and tasks that were studied, in which clinical domains, and current state-of-the-art results. Results: The ubiquity of pre-trained transformers in clinical NLP research has contributed to an increase in domain adaptation and generalization-focused work that uses these models as the key component. Most recently, work has started to train biomedical transformers and to extend the fine-tuning process with additional domain adaptation techniques. We also highlight recent research in cross-lingual adaptation, as a special case of adaptation. Conclusions: While pre-trained transformer models have led to some large performance improvements, general domain pre-training does not always transfer adequately to the clinical domain due to its highly specialized language. There is also much work to be done in showing that the gains obtained by pre-trained transformers are beneficial in real world use cases. The amount of work in domain adaptation and transfer learning is limited by dataset availability and creating datasets for new domains is challenging. The growing body of research in languages other than English is encouraging, and more collaboration between researchers across the language divide would likely accelerate progress in non-English clinical NLP. Georg Thieme Verlag KG 2021-08 2021-09-03 /pmc/articles/PMC8416218/ /pubmed/34479396 http://dx.doi.org/10.1055/s-0041-1726522 Text en IMIA and Thieme. This is an open access article published by Thieme under the terms of the Creative Commons Attribution-NonCommercial-NoDerivatives License ( https://creativecommons.org/licenses/by-nc-nd/4.0/ ), permitting copying and reproduction so long as the original work is given appropriate credit. Contents may not be used for commercial purposes, or adapted, remixed, transformed or built upon.
spellingShingle | Laparra, Egoitz; Mascio, Aurelie; Velupillai, Sumithra; Miller, Timothy; A Review of Recent Work in Transfer Learning and Domain Adaptation for Natural Language Processing of Electronic Health Records
title | A Review of Recent Work in Transfer Learning and Domain Adaptation for Natural Language Processing of Electronic Health Records |
title_full | A Review of Recent Work in Transfer Learning and Domain Adaptation for Natural Language Processing of Electronic Health Records |
title_fullStr | A Review of Recent Work in Transfer Learning and Domain Adaptation for Natural Language Processing of Electronic Health Records |
title_full_unstemmed | A Review of Recent Work in Transfer Learning and Domain Adaptation for Natural Language Processing of Electronic Health Records |
title_short | A Review of Recent Work in Transfer Learning and Domain Adaptation for Natural Language Processing of Electronic Health Records |
title_sort | review of recent work in transfer learning and domain adaptation for natural language processing of electronic health records |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8416218/ https://www.ncbi.nlm.nih.gov/pubmed/34479396 http://dx.doi.org/10.1055/s-0041-1726522 |
work_keys_str_mv | AT laparraegoitz areviewofrecentworkintransferlearninganddomainadaptationfornaturallanguageprocessingofelectronichealthrecords AT mascioaurelie areviewofrecentworkintransferlearninganddomainadaptationfornaturallanguageprocessingofelectronichealthrecords AT velupillaisumithra areviewofrecentworkintransferlearninganddomainadaptationfornaturallanguageprocessingofelectronichealthrecords AT millertimothy areviewofrecentworkintransferlearninganddomainadaptationfornaturallanguageprocessingofelectronichealthrecords AT laparraegoitz reviewofrecentworkintransferlearninganddomainadaptationfornaturallanguageprocessingofelectronichealthrecords AT mascioaurelie reviewofrecentworkintransferlearninganddomainadaptationfornaturallanguageprocessingofelectronichealthrecords AT velupillaisumithra reviewofrecentworkintransferlearninganddomainadaptationfornaturallanguageprocessingofelectronichealthrecords AT millertimothy reviewofrecentworkintransferlearninganddomainadaptationfornaturallanguageprocessingofelectronichealthrecords |
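To make concrete the transfer-learning recipe the abstract describes (start from a transformer pre-trained on biomedical or clinical text, then fine-tune it on a labeled clinical task), here is a minimal sketch using the Hugging Face transformers API. It is illustrative only and not code from the reviewed paper; the emilyalsentzer/Bio_ClinicalBERT checkpoint and the toy two-example dataset are assumptions made for the example.

```python
# Minimal sketch (illustrative, not from the paper): fine-tune a clinically
# pre-trained transformer on a downstream EHR classification task.
import torch
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

# A checkpoint already domain-adapted to clinical notes; a general-domain
# model such as "bert-base-uncased" would also load here, but typically
# transfers less well to clinical text (the review's central observation).
CHECKPOINT = "emilyalsentzer/Bio_ClinicalBERT"
tokenizer = AutoTokenizer.from_pretrained(CHECKPOINT)
model = AutoModelForSequenceClassification.from_pretrained(CHECKPOINT, num_labels=2)

# Hypothetical stand-in for an annotated EHR dataset.
texts = ["Patient denies chest pain.", "Severe chest pain radiating to left arm."]
labels = [0, 1]
enc = tokenizer(texts, truncation=True, padding=True)

class EHRDataset(torch.utils.data.Dataset):
    """Wraps tokenizer output and labels for the Trainer API."""
    def __init__(self, encodings, labels):
        self.encodings, self.labels = encodings, labels
    def __len__(self):
        return len(self.labels)
    def __getitem__(self, idx):
        item = {k: torch.tensor(v[idx]) for k, v in self.encodings.items()}
        item["labels"] = torch.tensor(self.labels[idx])
        return item

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="clinical_ft", num_train_epochs=1),
    train_dataset=EHRDataset(enc, labels),
)
trainer.train()  # updates all model weights on the clinical task
```

Much of the surveyed work extends exactly this loop, for example by continuing masked-language-model pre-training on in-domain notes before the fine-tuning step, which is one of the "additional domain adaptation techniques" the abstract refers to.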