
Keyphrase Identification Using Minimal Labeled Data with Hierarchical Context and Transfer Learning

Interoperable clinical decision support system (CDSS) rules provide a pathway to interoperability, a well-recognized challenge in health information technology. Building an ontology facilitates creating interoperable CDSS rules, which can be achieved by identifying the keyphrases (KP) from the exist...


Bibliographic Details
Main authors: Goli, Rohan; Hubig, Nina; Min, Hua; Gong, Yang; Sittig, Dean F.; Rennert, Lior; Robinson, David; Biondich, Paul; Wright, Adam; Nøhr, Christian; Law, Timothy; Faxvaag, Arild; Weaver, Aneesa; Gimbel, Ronald; Jing, Xia
Format: Online Article Text
Language: English
Published: Cold Spring Harbor Laboratory 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10246160/
https://www.ncbi.nlm.nih.gov/pubmed/37292830
http://dx.doi.org/10.1101/2023.01.26.23285060
_version_ 1785054988365463552
author Goli, Rohan
Hubig, Nina
Min, Hua
Gong, Yang
Sittig, Dean F.
Rennert, Lior
Robinson, David
Biondich, Paul
Wright, Adam
Nøhr, Christian
Law, Timothy
Faxvaag, Arild
Weaver, Aneesa
Gimbel, Ronald
Jing, Xia
author_facet Goli, Rohan
Hubig, Nina
Min, Hua
Gong, Yang
Sittig, Dean F.
Rennert, Lior
Robinson, David
Biondich, Paul
Wright, Adam
Nøhr, Christian
Law, Timothy
Faxvaag, Arild
Weaver, Aneesa
Gimbel, Ronald
Jing, Xia
author_sort Goli, Rohan
collection PubMed
description Interoperable clinical decision support system (CDSS) rules provide a pathway to interoperability, a well-recognized challenge in health information technology. Building an ontology facilitates creating interoperable CDSS rules, which can be achieved by identifying the keyphrases (KP) from the existing literature. However, KP identification for data labeling requires human expertise, consensus, and contextual understanding. This paper aims to present a semi-supervised KP identification framework using minimal labeled data based on hierarchical attention over the documents and domain adaptation. Our method outperforms the prior neural architectures by learning through synthetic labels for initial training, document-level contextual learning, language modeling, and fine-tuning with limited gold standard label data. To the best of our knowledge, this is the first functional framework for the CDSS sub-domain to identify KPs, which is trained on limited labeled data. It contributes to the general natural language processing (NLP) architectures in areas such as clinical NLP, where manual data labeling is challenging, and light-weighted deep learning models play a role in real-time KP identification as a complementary approach to human experts’ effort.
format Online
Article
Text
id pubmed-10246160
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Cold Spring Harbor Laboratory
record_format MEDLINE/PubMed
spelling pubmed-102461602023-06-08 Keyphrase Identification Using Minimal Labeled Data with Hierarchical Context and Transfer Learning Goli, Rohan Hubig, Nina Min, Hua Gong, Yang Sittig, Dean F. Rennert, Lior Robinson, David Biondich, Paul Wright, Adam Nøhr, Christian Law, Timothy Faxvaag, Arild Weaver, Aneesa Gimbel, Ronald Jing, Xia medRxiv Article Interoperable clinical decision support system (CDSS) rules provide a pathway to interoperability, a well-recognized challenge in health information technology. Building an ontology facilitates creating interoperable CDSS rules, which can be achieved by identifying the keyphrases (KP) from the existing literature. However, KP identification for data labeling requires human expertise, consensus, and contextual understanding. This paper aims to present a semi-supervised KP identification framework using minimal labeled data based on hierarchical attention over the documents and domain adaptation. Our method outperforms the prior neural architectures by learning through synthetic labels for initial training, document-level contextual learning, language modeling, and fine-tuning with limited gold standard label data. To the best of our knowledge, this is the first functional framework for the CDSS sub-domain to identify KPs, which is trained on limited labeled data. It contributes to the general natural language processing (NLP) architectures in areas such as clinical NLP, where manual data labeling is challenging, and light-weighted deep learning models play a role in real-time KP identification as a complementary approach to human experts’ effort. 
Cold Spring Harbor Laboratory 2023-05-26 /pmc/articles/PMC10246160/ /pubmed/37292830 http://dx.doi.org/10.1101/2023.01.26.23285060 Text en https://creativecommons.org/licenses/by-nc-nd/4.0/ This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License (https://creativecommons.org/licenses/by-nc-nd/4.0/), which allows reusers to copy and distribute the material in any medium or format in unadapted form only, for noncommercial purposes only, and only so long as attribution is given to the creator.
spellingShingle Article
Goli, Rohan
Hubig, Nina
Min, Hua
Gong, Yang
Sittig, Dean F.
Rennert, Lior
Robinson, David
Biondich, Paul
Wright, Adam
Nøhr, Christian
Law, Timothy
Faxvaag, Arild
Weaver, Aneesa
Gimbel, Ronald
Jing, Xia
Keyphrase Identification Using Minimal Labeled Data with Hierarchical Context and Transfer Learning
title Keyphrase Identification Using Minimal Labeled Data with Hierarchical Context and Transfer Learning
title_full Keyphrase Identification Using Minimal Labeled Data with Hierarchical Context and Transfer Learning
title_fullStr Keyphrase Identification Using Minimal Labeled Data with Hierarchical Context and Transfer Learning
title_full_unstemmed Keyphrase Identification Using Minimal Labeled Data with Hierarchical Context and Transfer Learning
title_short Keyphrase Identification Using Minimal Labeled Data with Hierarchical Context and Transfer Learning
title_sort keyphrase identification using minimal labeled data with hierarchical context and transfer learning
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10246160/
https://www.ncbi.nlm.nih.gov/pubmed/37292830
http://dx.doi.org/10.1101/2023.01.26.23285060