CLART: A cascaded lattice-and-radical transformer network for Chinese medical named entity recognition


Bibliographic Details
Main Authors: Xiao, Yinlong; Ji, Zongcheng; Li, Jianqiang; Zhu, Qing
Format: Online Article Text
Language: English
Published: Elsevier 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10590790/
https://www.ncbi.nlm.nih.gov/pubmed/37876457
http://dx.doi.org/10.1016/j.heliyon.2023.e20692
Description
Summary: Chinese medical named entity recognition (NER) is a fundamental task in Chinese medical natural language processing, aiming to recognize Chinese medical entities within unstructured medical texts. However, it poses significant challenges, mainly due to the extensive use of medical terms in Chinese medical texts. Although previous studies have attempted to incorporate lexical or radical knowledge to improve the comprehension of medical texts, they either focus on only one of these aspects or combine the features with a basic concatenation operation, which fails to fully exploit the potential of lexical and radical knowledge. In this paper, we propose a novel Cascaded LAttice-and-Radical Transformer (CLART) network to exploit both lexical and radical information for Chinese medical NER. Specifically, given a sentence, a medical lexicon, and a radical dictionary, we first construct a flat lattice (i.e., a character-word sequence) for the sentence and the radical components of each Chinese character through word matching and radical parsing, respectively. We then employ a lattice Transformer module to capture the dense interactions between characters and matched words, making fuller use of lexical knowledge. Subsequently, we design a radical Transformer module to model the dense interactions between the lattice and radical features, enabling better fusion of lexical and radical knowledge. Finally, we feed the updated lattice-and-radical-aware character representations into a Conditional Random Fields (CRF) decoder to obtain the predicted labels. Experiments on two publicly available Chinese medical NER datasets demonstrate the effectiveness of the proposed method.
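The first step in the summary, building a flat lattice by word matching and looking up radical components per character, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names `build_flat_lattice` and `parse_radicals`, the `max_word_len` parameter, the (text, head, tail) token layout, and the toy lexicon and radical entries are all assumptions made for illustration. The Transformer modules and the CRF decoder are not shown.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical token layout: (text, head index, tail index), both inclusive.
Token = Tuple[str, int, int]


def build_flat_lattice(sentence: str, lexicon: Set[str],
                       max_word_len: int = 4) -> List[Token]:
    """Return the characters of `sentence` followed by every matched lexicon word."""
    # Each character spans exactly one position.
    tokens: List[Token] = [(ch, i, i) for i, ch in enumerate(sentence)]
    # Append every substring of length 2..max_word_len found in the lexicon.
    for start in range(len(sentence)):
        longest_end = min(start + max_word_len, len(sentence))
        for end in range(start + 2, longest_end + 1):
            word = sentence[start:end]
            if word in lexicon:
                tokens.append((word, start, end - 1))
    return tokens


def parse_radicals(sentence: str,
                   radical_dict: Dict[str, List[str]]) -> List[List[str]]:
    """Look up the components of each character; fall back to the character itself."""
    return [radical_dict.get(ch, [ch]) for ch in sentence]


# Toy resources, assumed for illustration only (not the paper's lexicon or dictionary).
toy_lexicon = {"糖尿病", "患者"}
toy_radicals = {"糖": ["米", "唐"], "病": ["疒", "丙"]}

sentence = "糖尿病患者"
lattice = build_flat_lattice(sentence, toy_lexicon)
radicals = parse_radicals(sentence, toy_radicals)

for token in lattice:
    print(token)
print(radicals)
```

In this sketch the matched words are appended after the characters, and each token keeps its head and tail positions so that a downstream model could still recover which characters a word covers. Characters missing from the toy radical dictionary simply map to themselves.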