Rethinking domain adaptation for machine learning over clinical language
Building clinical natural language processing (NLP) systems that work on widely varying data is an absolute necessity because of the expense of obtaining new training data. While domain adaptation research can have a positive impact on this problem, the most widely studied paradigms do not take into...
Main authors: | Laparra, Egoitz; Bethard, Steven; Miller, Timothy A |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2020 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7382626/ https://www.ncbi.nlm.nih.gov/pubmed/32734151 http://dx.doi.org/10.1093/jamiaopen/ooaa010 |
_version_ | 1783563281566269440 |
---|---|
author | Laparra, Egoitz Bethard, Steven Miller, Timothy A |
author_facet | Laparra, Egoitz Bethard, Steven Miller, Timothy A |
author_sort | Laparra, Egoitz |
collection | PubMed |
description | Building clinical natural language processing (NLP) systems that work on widely varying data is an absolute necessity because of the expense of obtaining new training data. While domain adaptation research can have a positive impact on this problem, the most widely studied paradigms do not take into account the realities of clinical data sharing. To address this issue, we lay out a taxonomy of domain adaptation, parameterizing by what data is shareable. We show that the most realistic settings for clinical use cases are seriously under-studied. To support research in these important directions, we make a series of recommendations, not just for domain adaptation but for clinical NLP in general, that ensure that data, shared tasks, and released models are broadly useful, and that initiate research directions where the clinical NLP community can lead the broader NLP and machine learning fields. |
format | Online Article Text |
id | pubmed-7382626 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-73826262020-07-29 Rethinking domain adaptation for machine learning over clinical language Laparra, Egoitz Bethard, Steven Miller, Timothy A JAMIA Open Perspective Building clinical natural language processing (NLP) systems that work on widely varying data is an absolute necessity because of the expense of obtaining new training data. While domain adaptation research can have a positive impact on this problem, the most widely studied paradigms do not take into account the realities of clinical data sharing. To address this issue, we lay out a taxonomy of domain adaptation, parameterizing by what data is shareable. We show that the most realistic settings for clinical use cases are seriously under-studied. To support research in these important directions, we make a series of recommendations, not just for domain adaptation but for clinical NLP in general, that ensure that data, shared tasks, and released models are broadly useful, and that initiate research directions where the clinical NLP community can lead the broader NLP and machine learning fields. Oxford University Press 2020-04-13 /pmc/articles/PMC7382626/ /pubmed/32734151 http://dx.doi.org/10.1093/jamiaopen/ooaa010 Text en © The Author(s) 2020. Published by Oxford University Press on behalf of the American Medical Informatics Association. http://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Perspective Laparra, Egoitz Bethard, Steven Miller, Timothy A Rethinking domain adaptation for machine learning over clinical language |
title | Rethinking domain adaptation for machine learning over clinical language |
title_full | Rethinking domain adaptation for machine learning over clinical language |
title_fullStr | Rethinking domain adaptation for machine learning over clinical language |
title_full_unstemmed | Rethinking domain adaptation for machine learning over clinical language |
title_short | Rethinking domain adaptation for machine learning over clinical language |
title_sort | rethinking domain adaptation for machine learning over clinical language |
topic | Perspective |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7382626/ https://www.ncbi.nlm.nih.gov/pubmed/32734151 http://dx.doi.org/10.1093/jamiaopen/ooaa010 |
work_keys_str_mv | AT laparraegoitz rethinkingdomainadaptationformachinelearningoverclinicallanguage AT bethardsteven rethinkingdomainadaptationformachinelearningoverclinicallanguage AT millertimothya rethinkingdomainadaptationformachinelearningoverclinicallanguage |