
SBLC: a hybrid model for disease named entity recognition based on semantic bidirectional LSTMs and conditional random fields

BACKGROUND: Disease named entity recognition (NER) is a fundamental step in information processing of medical texts. However, disease NER involves complex issues in actual practice, such as descriptive modifiers. The accurate identification of disease named entities is still an open and essential research pro...


Bibliographic Details
Main Authors: Xu, Kai; Zhou, Zhanfan; Gong, Tao; Hao, Tianyong; Liu, Wenyin
Format: Online Article Text
Language: English
Published: BioMed Central 2018
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6284263/
https://www.ncbi.nlm.nih.gov/pubmed/30526592
http://dx.doi.org/10.1186/s12911-018-0690-y
author Xu, Kai
Zhou, Zhanfan
Gong, Tao
Hao, Tianyong
Liu, Wenyin
collection PubMed
description BACKGROUND: Disease named entity recognition (NER) is a fundamental step in information processing of medical texts. However, disease NER involves complex issues in actual practice, such as descriptive modifiers. The accurate identification of disease named entities is still an open and essential research problem in medical information extraction and text mining tasks. METHODS: A hybrid model named Semantics Bidirectional LSTM and CRF (SBLC) is proposed for the disease named entity recognition task. The model leverages word embeddings, Bidirectional Long Short-Term Memory networks and Conditional Random Fields. A publicly available NCBI disease dataset is used to evaluate the model by comparing it with nine state-of-the-art baseline methods, including cTAKES, MetaMap, DNorm, C-Bi-LSTM-CRF, TaggerOne and DNER. RESULTS: The results show that the SBLC model achieves an F1 score of 0.862 and outperforms the other methods. In addition, the model does not rely on external domain dictionaries, so it can be more conveniently applied in many aspects of medical text processing. CONCLUSIONS: According to the performance comparison, the proposed SBLC model achieved the best performance, demonstrating its effectiveness in disease named entity recognition.
format Online
Article
Text
id pubmed-6284263
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-6284263 2018-12-14 SBLC: a hybrid model for disease named entity recognition based on semantic bidirectional LSTMs and conditional random fields Xu, Kai Zhou, Zhanfan Gong, Tao Hao, Tianyong Liu, Wenyin BMC Med Inform Decis Mak Research BACKGROUND: Disease named entity recognition (NER) is a fundamental step in information processing of medical texts. However, disease NER involves complex issues in actual practice, such as descriptive modifiers. The accurate identification of disease named entities is still an open and essential research problem in medical information extraction and text mining tasks. METHODS: A hybrid model named Semantics Bidirectional LSTM and CRF (SBLC) is proposed for the disease named entity recognition task. The model leverages word embeddings, Bidirectional Long Short-Term Memory networks and Conditional Random Fields. A publicly available NCBI disease dataset is used to evaluate the model by comparing it with nine state-of-the-art baseline methods, including cTAKES, MetaMap, DNorm, C-Bi-LSTM-CRF, TaggerOne and DNER. RESULTS: The results show that the SBLC model achieves an F1 score of 0.862 and outperforms the other methods. In addition, the model does not rely on external domain dictionaries, so it can be more conveniently applied in many aspects of medical text processing. CONCLUSIONS: According to the performance comparison, the proposed SBLC model achieved the best performance, demonstrating its effectiveness in disease named entity recognition. BioMed Central 2018-12-07 /pmc/articles/PMC6284263/ /pubmed/30526592 http://dx.doi.org/10.1186/s12911-018-0690-y Text en © The Author(s). 2018 Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title SBLC: a hybrid model for disease named entity recognition based on semantic bidirectional LSTMs and conditional random fields
topic Research