A Mixed Semantic Features Model for Chinese NER with Characters and Words
Named Entity Recognition (NER) is an essential part of many natural language processing (NLP) tasks. The existing Chinese NER methods are mostly based on word segmentation, or use the character sequences as input. However, using a single granularity representation would suffer from the problems of o...
Main authors: | Chang, Ning; Zhong, Jiang; Li, Qing; Zhu, Jiang |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | 2020 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7148203/ http://dx.doi.org/10.1007/978-3-030-45439-5_24 |
_version_ | 1783520542502944768 |
---|---|
author | Chang, Ning Zhong, Jiang Li, Qing Zhu, Jiang |
author_facet | Chang, Ning Zhong, Jiang Li, Qing Zhu, Jiang |
author_sort | Chang, Ning |
collection | PubMed |
description | Named Entity Recognition (NER) is an essential part of many natural language processing (NLP) tasks. The existing Chinese NER methods are mostly based on word segmentation, or use the character sequences as input. However, using a single granularity representation would suffer from the problems of out-of-vocabulary and word segmentation errors, and the semantic content is relatively simple. In this paper, we introduce the self-attention mechanism into the BiLSTM-CRF neural network structure for Chinese named entity recognition with two embeddings. Unlike other models, our method combines character and word features at the sequence level, and the attention mechanism computes similarity over the total sequence consisting of characters and words. The character semantic information and the structure of words work together to improve the accuracy of word boundary segmentation and solve the problem of long-phrase combination. We validate our model on the MSRA and Weibo corpora, and experiments demonstrate that our model can significantly improve the performance of the Chinese NER task. |
format | Online Article Text |
id | pubmed-7148203 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
record_format | MEDLINE/PubMed |
spelling | pubmed-7148203 2020-04-13 A Mixed Semantic Features Model for Chinese NER with Characters and Words Chang, Ning Zhong, Jiang Li, Qing Zhu, Jiang Advances in Information Retrieval Article Named Entity Recognition (NER) is an essential part of many natural language processing (NLP) tasks. The existing Chinese NER methods are mostly based on word segmentation, or use the character sequences as input. However, using a single granularity representation would suffer from the problems of out-of-vocabulary and word segmentation errors, and the semantic content is relatively simple. In this paper, we introduce the self-attention mechanism into the BiLSTM-CRF neural network structure for Chinese named entity recognition with two embeddings. Unlike other models, our method combines character and word features at the sequence level, and the attention mechanism computes similarity over the total sequence consisting of characters and words. The character semantic information and the structure of words work together to improve the accuracy of word boundary segmentation and solve the problem of long-phrase combination. We validate our model on the MSRA and Weibo corpora, and experiments demonstrate that our model can significantly improve the performance of the Chinese NER task. 2020-03-17 /pmc/articles/PMC7148203/ http://dx.doi.org/10.1007/978-3-030-45439-5_24 Text en © Springer Nature Switzerland AG 2020 This article is made available via the PMC Open Access Subset for unrestricted research re-use and secondary analysis in any form or by any means with acknowledgement of the original source. These permissions are granted for the duration of the World Health Organization (WHO) declaration of COVID-19 as a global pandemic. |
spellingShingle | Article Chang, Ning Zhong, Jiang Li, Qing Zhu, Jiang A Mixed Semantic Features Model for Chinese NER with Characters and Words |
title | A Mixed Semantic Features Model for Chinese NER with Characters and Words |
title_full | A Mixed Semantic Features Model for Chinese NER with Characters and Words |
title_fullStr | A Mixed Semantic Features Model for Chinese NER with Characters and Words |
title_full_unstemmed | A Mixed Semantic Features Model for Chinese NER with Characters and Words |
title_short | A Mixed Semantic Features Model for Chinese NER with Characters and Words |
title_sort | mixed semantic features model for chinese ner with characters and words |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7148203/ http://dx.doi.org/10.1007/978-3-030-45439-5_24 |
work_keys_str_mv | AT changning amixedsemanticfeaturesmodelforchinesenerwithcharactersandwords AT zhongjiang amixedsemanticfeaturesmodelforchinesenerwithcharactersandwords AT liqing amixedsemanticfeaturesmodelforchinesenerwithcharactersandwords AT zhujiang amixedsemanticfeaturesmodelforchinesenerwithcharactersandwords AT changning mixedsemanticfeaturesmodelforchinesenerwithcharactersandwords AT zhongjiang mixedsemanticfeaturesmodelforchinesenerwithcharactersandwords AT liqing mixedsemanticfeaturesmodelforchinesenerwithcharactersandwords AT zhujiang mixedsemanticfeaturesmodelforchinesenerwithcharactersandwords |
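
The description field of this record says that the attention mechanism computes similarity over one sequence made up of both characters and words. The record does not give the paper's actual architecture, so the following is only a minimal sketch of that single idea: plain scaled dot-product self-attention over a concatenated character-plus-word sequence. The toy vectors, the function names, and the use of identity projections (no learned weights) are all assumptions made for illustration; the BiLSTM and CRF layers mentioned in the description are omitted entirely.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(vectors):
    """Scaled dot-product self-attention with identity projections.

    Assumption: queries, keys and values are the input vectors themselves;
    a trained model would apply learned projection matrices first.
    """
    dim = len(vectors[0])
    scale = math.sqrt(dim)
    outputs, weights = [], []
    for query in vectors:
        # Similarity of this position to every position in the mixed sequence.
        scores = [sum(q * k for q, k in zip(query, key)) / scale
                  for key in vectors]
        row = softmax(scores)
        weights.append(row)
        # Weighted sum of all vectors gives the attended representation.
        outputs.append([sum(w * v[i] for w, v in zip(row, vectors))
                        for i in range(dim)])
    return outputs, weights


# Hypothetical toy embeddings (not taken from the paper):
# three character vectors and two word vectors of the same dimension.
char_vectors = [[1.0, 0.0, 0.0, 0.0],
                [0.0, 1.0, 0.0, 0.0],
                [0.0, 0.0, 1.0, 0.0]]
word_vectors = [[0.5, 0.5, 0.0, 0.0],
                [0.0, 0.0, 1.0, 0.0]]

# Combine both granularities at the sequence level, then attend over all of it.
mixed_sequence = char_vectors + word_vectors
outputs, weights = self_attention(mixed_sequence)

for position, row in enumerate(weights):
    print(position, [round(w, 3) for w in row])
```

Each printed row shows how strongly one position, whether a character or a word, attends to every other position in the mixed sequence. In this sketch the two granularities share one embedding dimension so that they can sit in the same sequence; how the paper aligns them is not stated in this record.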