Combining Contextualized Embeddings and Prior Knowledge for Clinical Named Entity Recognition: Evaluation Study

BACKGROUND: Named entity recognition (NER) is a key step in clinical natural language processing (NLP). Traditionally, rule-based systems leverage prior knowledge to define rules for identifying named entities. Recently, deep learning-based NER systems have become increasingly popular. Contextualized word embeddings, a new type of word representation, dynamically capture word sense from the surrounding context and have proven successful in many deep learning-based systems in both the general and medical domains. However, very few studies have investigated the effect of combining multiple contextualized embeddings and prior knowledge on the clinical NER task.

OBJECTIVE: This study aims to improve the performance of NER in clinical text by combining multiple contextualized embeddings and prior knowledge.

METHODS: We investigate the effect of combining multiple contextualized word embeddings with a classic word embedding in deep neural networks to predict named entities in clinical text. We also investigate whether a semantic lexicon can further improve the performance of the clinical NER system.

RESULTS: By combining contextualized embeddings such as ELMo and Flair, our system achieves an F1 score of 87.30% when trained on only a portion of the 2010 Informatics for Integrating Biology and the Bedside (i2b2) NER task dataset. After incorporating a medical lexicon into the word embedding, the F1 score increased further to 87.44%. The system still achieved an F1 score of 85.36% when the training data was reduced to 40% of its original size.

CONCLUSIONS: Combining contextualized embeddings can benefit the clinical NER task, and a semantic lexicon can be used to further improve the performance of the clinical NER system.
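The METHODS above describe stacking a classic word embedding with contextualized embeddings (ELMo and Flair) and feeding the combined representation to a deep sequence tagger, with lexicon features added as prior knowledge. The sketch below is a minimal, hypothetical illustration of that kind of setup using the open-source Flair NLP library, not the authors' actual configuration: the file paths, pretrained model names, and hyperparameters are placeholder assumptions (the 2010 i2b2 corpus requires a data use agreement and is not bundled with any library), and the lexicon features are only indicated by a comment.

# Minimal sketch, assuming the Flair NLP library is installed and the i2b2 2010
# data has already been converted to CoNLL-style token/tag columns under data/
# (file names below are placeholders).
from flair.datasets import ColumnCorpus
from flair.embeddings import WordEmbeddings, FlairEmbeddings, StackedEmbeddings
from flair.models import SequenceTagger
from flair.trainers import ModelTrainer

# 1. Load a column-formatted NER corpus: token in column 0, BIO tag in column 1.
corpus = ColumnCorpus(
    "data/", {0: "text", 1: "ner"},
    train_file="train.conll", dev_file="dev.conll", test_file="test.conll",
)

# 2. Stack a classic word embedding with contextualized character-level Flair
#    embeddings; ELMoEmbeddings could be appended here as well (it requires the
#    allennlp package), and lexicon features would enter as one more embedding.
embeddings = StackedEmbeddings(embeddings=[
    WordEmbeddings("glove"),           # classic, context-independent vectors
    FlairEmbeddings("news-forward"),   # contextualized; biomedical models also exist
    FlairEmbeddings("news-backward"),
])

# 3. BiLSTM-CRF sequence tagger over the stacked representation.
#    (Older Flair releases use corpus.make_tag_dictionary(tag_type="ner").)
tag_dictionary = corpus.make_label_dictionary(label_type="ner")
tagger = SequenceTagger(
    hidden_size=256,
    embeddings=embeddings,
    tag_dictionary=tag_dictionary,
    tag_type="ner",
    use_crf=True,
)

# 4. Train and evaluate; the hyperparameters are illustrative defaults only.
trainer = ModelTrainer(tagger, corpus)
trainer.train("models/clinical-ner", learning_rate=0.1, mini_batch_size=32, max_epochs=50)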

Bibliographic Details
Main Authors: Jiang, Min; Sanger, Todd; Liu, Xiong
Format: Online Article Text
Language: English
Published: JMIR Publications, 2019
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6913757/
https://www.ncbi.nlm.nih.gov/pubmed/31719024
http://dx.doi.org/10.2196/14850
collection PubMed
id pubmed-6913757
institution National Center for Biotechnology Information
record_format MEDLINE/PubMed
journal JMIR Med Inform
published online 2019-11-13
rights ©Min Jiang, Todd Sanger, Xiong Liu. Originally published in JMIR Medical Informatics (http://medinform.jmir.org), 13.11.2019. This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Medical Informatics, is properly cited. The complete bibliographic information, a link to the original publication on http://medinform.jmir.org/, as well as this copyright and license information must be included.
topic Original Paper