Keyphrase Identification Using Minimal Labeled Data with Hierarchical Context and Transfer Learning
Interoperable clinical decision support system (CDSS) rules provide a pathway to interoperability, a well-recognized challenge in health information technology. Building an ontology facilitates creating interoperable CDSS rules, which can be achieved by identifying the keyphrases (KP) from the exist...
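Main Authors: | Goli, Rohan; Hubig, Nina; Min, Hua; Gong, Yang; Sittig, Dean F.; Rennert, Lior; Robinson, David; Biondich, Paul; Wright, Adam; Nøhr, Christian; Law, Timothy; Faxvaag, Arild; Weaver, Aneesa; Gimbel, Ronald; Jing, Xia |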
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Cold Spring Harbor Laboratory 2023 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10246160/ https://www.ncbi.nlm.nih.gov/pubmed/37292830 http://dx.doi.org/10.1101/2023.01.26.23285060 |
_version_ | 1785054988365463552 |
---|---|
author | Goli, Rohan Hubig, Nina Min, Hua Gong, Yang Sittig, Dean F. Rennert, Lior Robinson, David Biondich, Paul Wright, Adam Nøhr, Christian Law, Timothy Faxvaag, Arild Weaver, Aneesa Gimbel, Ronald Jing, Xia |
author_sort | Goli, Rohan |
collection | PubMed |
description | Interoperable clinical decision support system (CDSS) rules provide a pathway to interoperability, a well-recognized challenge in health information technology. Building an ontology facilitates creating interoperable CDSS rules, which can be achieved by identifying the keyphrases (KP) from the existing literature. However, KP identification for data labeling requires human expertise, consensus, and contextual understanding. This paper aims to present a semi-supervised KP identification framework using minimal labeled data based on hierarchical attention over the documents and domain adaptation. Our method outperforms the prior neural architectures by learning through synthetic labels for initial training, document-level contextual learning, language modeling, and fine-tuning with limited gold standard label data. To the best of our knowledge, this is the first functional framework for the CDSS sub-domain to identify KPs, which is trained on limited labeled data. It contributes to the general natural language processing (NLP) architectures in areas such as clinical NLP, where manual data labeling is challenging, and light-weighted deep learning models play a role in real-time KP identification as a complementary approach to human experts’ effort. |
format | Online Article Text |
id | pubmed-10246160 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Cold Spring Harbor Laboratory |
record_format | MEDLINE/PubMed |
spelling | pubmed-10246160 2023-06-08 medRxiv Article Cold Spring Harbor Laboratory 2023-05-26 /pmc/articles/PMC10246160/ /pubmed/37292830 http://dx.doi.org/10.1101/2023.01.26.23285060 Text en https://creativecommons.org/licenses/by-nc-nd/4.0/ This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License (https://creativecommons.org/licenses/by-nc-nd/4.0/), which allows reusers to copy and distribute the material in any medium or format in unadapted form only, for noncommercial purposes only, and only so long as attribution is given to the creator. |
title | Keyphrase Identification Using Minimal Labeled Data with Hierarchical Context and Transfer Learning |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10246160/ https://www.ncbi.nlm.nih.gov/pubmed/37292830 http://dx.doi.org/10.1101/2023.01.26.23285060 |