
Research on Named Entity Recognition Method of Metro On-Board Equipment Based on Multiheaded Self-Attention Mechanism and CNN-BiLSTM-CRF

Bibliographic Details
Main authors: Lin, Junting; Liu, Endong
Format: Online Article Text
Language: English
Published: Hindawi 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9279046/
https://www.ncbi.nlm.nih.gov/pubmed/35845883
http://dx.doi.org/10.1155/2022/6374988
Description
Summary: The operation of subway trains generates massive, complex, unstructured fault text data. To address the low recognition accuracy and incomplete feature extraction in named entity recognition (NER) on unstructured fault data from subway on-board equipment, a NER model based on a multiheaded self-attention mechanism and CNN-BiLSTM-CRF is proposed. A parallel BiLSTM-CNN network extracts context feature information and local attention information, respectively. In the MHA layer, the features learned from different dimensions are fused through the multiheaded self-attention mechanism, and dependencies of various ranges in the sequence are captured to yield the internal structure information of the features. A conditional random field (CRF) then learns the internal relationships between tags so that the predicted tags form a consistent sequence. The model is compared with other NER models on the labeled subway on-board fault data. The experimental results show that it recognizes the 10 kinds of labels in the dataset and performs well on each label in the three evaluation indexes: precision (P), recall (R), and F1 score. Its weighted-average indexes over the 10 labels, Avg-P, Avg-R, and Avg-F1, reach the highest values of 95.39%, 95.48%, and 95.37%, so the model can be applied to named entity recognition of metro on-board equipment.
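
The summary reports per-label P, R, and F1 together with weighted averages over the labels, but does not say how the weights are chosen. The sketch below shows one common convention, assumed here rather than taken from the paper: entities are scored by exact match on span and label, and each label's scores are weighted by its number of gold entities. The label names `EQUIP` and `FAULT`, the tuple layout, and both function names are hypothetical and serve only as illustration.

```python
from collections import Counter


def per_label_scores(gold, pred):
    """Return {label: (precision, recall, f1, support)}.

    Each entity is a tuple (sentence_id, start, end, label); a prediction
    counts as correct only if all four fields match a gold entity.
    """
    gold_set, pred_set = set(gold), set(pred)
    true_pos = Counter(e[-1] for e in gold_set & pred_set)
    n_gold = Counter(e[-1] for e in gold_set)
    n_pred = Counter(e[-1] for e in pred_set)

    scores = {}
    for label, support in n_gold.items():
        p = true_pos[label] / n_pred[label] if n_pred[label] else 0.0
        r = true_pos[label] / support
        f1 = 2 * p * r / (p + r) if (p + r) else 0.0
        scores[label] = (p, r, f1, support)
    return scores


def weighted_average(scores):
    """Average P, R and F1 over labels, weighted by gold support (assumed)."""
    total = sum(s[3] for s in scores.values())
    return tuple(
        sum(s[i] * s[3] for s in scores.values()) / total for i in range(3)
    )


# Toy data with hypothetical labels; not drawn from the paper's dataset.
gold = [(0, 0, 2, "EQUIP"), (0, 5, 7, "FAULT"),
        (1, 0, 1, "EQUIP"), (1, 3, 4, "EQUIP")]
pred = [(0, 0, 2, "EQUIP"), (0, 5, 7, "FAULT"),
        (1, 0, 1, "EQUIP"), (1, 3, 5, "EQUIP")]  # last span has a wrong end

avg_p, avg_r, avg_f1 = weighted_average(per_label_scores(gold, pred))
print(f"Avg-P={avg_p:.4f}  Avg-R={avg_r:.4f}  Avg-F1={avg_f1:.4f}")
```

With support weighting, frequent labels dominate the average, so a high Avg-F1 can coexist with weaker scores on rare labels; the per-label figures remain the better check for those.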