Cargando…

Semi-Supervised Bidirectional Long Short-Term Memory and Conditional Random Fields Model for Named-Entity Recognition Using Embeddings from Language Models Representations

Increasingly, popular online museums have significantly changed the way people acquire cultural knowledge. These online museums have been generating abundant amounts of cultural relics data. In recent years, researchers have used deep learning models that can automatically extract complex features a...

Descripción completa

Detalles Bibliográficos
Autores principales:	Zhang, Min, Geng, Guohua, Chen, Jing
Formato:	Online Artículo Texto
Lenguaje:	English
Publicado:	MDPI 2020
Materias:	Article
Acceso en línea:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7516692/ https://www.ncbi.nlm.nih.gov/pubmed/33286026 http://dx.doi.org/10.3390/e22020252

_version_	1783587059684868096
author	Zhang, Min Geng, Guohua Chen, Jing
author_facet	Zhang, Min Geng, Guohua Chen, Jing
author_sort	Zhang, Min
collection	PubMed
description	Increasingly, popular online museums have significantly changed the way people acquire cultural knowledge. These online museums have been generating abundant amounts of cultural relics data. In recent years, researchers have used deep learning models that can automatically extract complex features and have rich representation capabilities to implement named-entity recognition (NER). However, the lack of labeled data in the field of cultural relics makes it difficult for deep learning models that rely on labeled data to achieve excellent performance. To address this problem, this paper proposes a semi-supervised deep learning model named SCRNER (Semi-supervised model for Cultural Relics’ Named Entity Recognition) that utilizes the bidirectional long short-term memory (BiLSTM) and conditional random fields (CRF) model trained by seldom labeled data and abundant unlabeled data to attain an effective performance. To satisfy the semi-supervised sample selection, we propose a repeat-labeled (relabeled) strategy to select samples of high confidence to enlarge the training set iteratively. In addition, we use embeddings from language model (ELMo) representations to dynamically acquire word representations as the input of the model to solve the problem of the blurred boundaries of cultural objects and Chinese characteristics of texts in the field of cultural relics. Experimental results demonstrate that our proposed model, trained on limited labeled data, achieves an effective performance in the task of named entity recognition of cultural relics.
format	Online Article Text
id	pubmed-7516692
institution	National Center for Biotechnology Information
language	English
publishDate	2020
publisher	MDPI
record_format	MEDLINE/PubMed
spelling	pubmed-75166922020-11-09 Semi-Supervised Bidirectional Long Short-Term Memory and Conditional Random Fields Model for Named-Entity Recognition Using Embeddings from Language Models Representations Zhang, Min Geng, Guohua Chen, Jing Entropy (Basel) Article Increasingly, popular online museums have significantly changed the way people acquire cultural knowledge. These online museums have been generating abundant amounts of cultural relics data. In recent years, researchers have used deep learning models that can automatically extract complex features and have rich representation capabilities to implement named-entity recognition (NER). However, the lack of labeled data in the field of cultural relics makes it difficult for deep learning models that rely on labeled data to achieve excellent performance. To address this problem, this paper proposes a semi-supervised deep learning model named SCRNER (Semi-supervised model for Cultural Relics’ Named Entity Recognition) that utilizes the bidirectional long short-term memory (BiLSTM) and conditional random fields (CRF) model trained by seldom labeled data and abundant unlabeled data to attain an effective performance. To satisfy the semi-supervised sample selection, we propose a repeat-labeled (relabeled) strategy to select samples of high confidence to enlarge the training set iteratively. In addition, we use embeddings from language model (ELMo) representations to dynamically acquire word representations as the input of the model to solve the problem of the blurred boundaries of cultural objects and Chinese characteristics of texts in the field of cultural relics. Experimental results demonstrate that our proposed model, trained on limited labeled data, achieves an effective performance in the task of named entity recognition of cultural relics. MDPI 2020-02-22 /pmc/articles/PMC7516692/ /pubmed/33286026 http://dx.doi.org/10.3390/e22020252 Text en © 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
spellingShingle	Article Zhang, Min Geng, Guohua Chen, Jing Semi-Supervised Bidirectional Long Short-Term Memory and Conditional Random Fields Model for Named-Entity Recognition Using Embeddings from Language Models Representations
title	Semi-Supervised Bidirectional Long Short-Term Memory and Conditional Random Fields Model for Named-Entity Recognition Using Embeddings from Language Models Representations
title_full	Semi-Supervised Bidirectional Long Short-Term Memory and Conditional Random Fields Model for Named-Entity Recognition Using Embeddings from Language Models Representations
title_fullStr	Semi-Supervised Bidirectional Long Short-Term Memory and Conditional Random Fields Model for Named-Entity Recognition Using Embeddings from Language Models Representations
title_full_unstemmed	Semi-Supervised Bidirectional Long Short-Term Memory and Conditional Random Fields Model for Named-Entity Recognition Using Embeddings from Language Models Representations
title_short	Semi-Supervised Bidirectional Long Short-Term Memory and Conditional Random Fields Model for Named-Entity Recognition Using Embeddings from Language Models Representations
title_sort	semi-supervised bidirectional long short-term memory and conditional random fields model for named-entity recognition using embeddings from language models representations
topic	Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7516692/ https://www.ncbi.nlm.nih.gov/pubmed/33286026 http://dx.doi.org/10.3390/e22020252
work_keys_str_mv	AT zhangmin semisupervisedbidirectionallongshorttermmemoryandconditionalrandomfieldsmodelfornamedentityrecognitionusingembeddingsfromlanguagemodelsrepresentations AT gengguohua semisupervisedbidirectionallongshorttermmemoryandconditionalrandomfieldsmodelfornamedentityrecognitionusingembeddingsfromlanguagemodelsrepresentations AT chenjing semisupervisedbidirectionallongshorttermmemoryandconditionalrandomfieldsmodelfornamedentityrecognitionusingembeddingsfromlanguagemodelsrepresentations

Semi-Supervised Bidirectional Long Short-Term Memory and Conditional Random Fields Model for Named-Entity Recognition Using Embeddings from Language Models Representations

Ejemplares similares