A multi-layer soft lattice based model for Chinese clinical named entity recognition
Main Authors: | Guo, Shuli; Yang, Wentao; Han, Lina; Song, Xiaowei; Wang, Guowei |
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central, 2022 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9338545/ https://www.ncbi.nlm.nih.gov/pubmed/35908055 http://dx.doi.org/10.1186/s12911-022-01924-4 |
_version_ | 1784759992411226112 |
author | Guo, Shuli; Yang, Wentao; Han, Lina; Song, Xiaowei; Wang, Guowei |
author_facet | Guo, Shuli; Yang, Wentao; Han, Lina; Song, Xiaowei; Wang, Guowei |
author_sort | Guo, Shuli |
collection | PubMed |
description | OBJECTIVE: Named entity recognition (NER) is a key and fundamental part of many medical and clinical tasks, including the establishment of a medical knowledge graph, decision-making support, and question answering systems. When extracting entities from electronic health records (EHRs), NER models mostly apply long short-term memory (LSTM) and achieve impressive performance in clinical NER. However, these LSTM-based models often require increased network depth to capture long-distance dependencies. Therefore, LSTM-based models that achieve high accuracy generally require long training times and extensive training data, which has obstructed their adoption in clinical scenarios with limited training time. METHOD: Inspired by the Transformer, we combine the Transformer with a Soft Term Position Lattice to form a soft lattice structure Transformer, which models long-distance dependencies similarly to LSTM. Our model consists of four components: the WordPiece module, the BERT module, the soft lattice structure Transformer module, and the CRF module. RESULT: Our experiments demonstrated that this approach increased the F1 score by 1–5% in the CCKS NER task compared to other LSTM-CRF-based models and consumed less training time. Additional evaluations showed that the lattice structure Transformer performs well in recognizing long medical terms, abbreviations, and numbers. The proposed model achieves an F-measure of 91.6% in recognizing long medical terms and 90.36% for abbreviations and numbers. CONCLUSIONS: By using the soft lattice structure Transformer, the method proposed in this paper captures Chinese word-lattice information, making our model suitable for Chinese clinical medical records. Transformers with multilayer soft lattice Chinese word construction can capture potential interactions between Chinese characters and words. |
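The CRF module named in the abstract produces the final tag sequence by decoding emission scores from the encoder together with learned tag-transition scores. A minimal pure-Python sketch of that Viterbi decoding step is below; the BIO tag set, emission scores, and transition scores are hypothetical illustrations, not values from the paper:

```python
# Minimal sketch of CRF decoding (Viterbi) over per-token emission scores
# and tag-transition scores. All scores and tags here are made-up examples.

def viterbi_decode(emissions, transitions, tags):
    """emissions: list of {tag: score} dicts, one per token.
    transitions: {(prev_tag, tag): score}; missing pairs score 0.
    Returns the highest-scoring tag sequence."""
    # Initialize with the first token's emission scores.
    scores = {t: emissions[0][t] for t in tags}
    backpointers = []
    for emit in emissions[1:]:
        new_scores, bp = {}, {}
        for t in tags:
            # Pick the best previous tag for transitioning into t.
            prev, s = max(
                ((p, scores[p] + transitions.get((p, t), 0.0)) for p in tags),
                key=lambda x: x[1],
            )
            new_scores[t] = s + emit[t]
            bp[t] = prev
        scores = new_scores
        backpointers.append(bp)
    # Trace the best path backwards from the highest-scoring final tag.
    best = max(scores, key=scores.get)
    path = [best]
    for bp in reversed(backpointers):
        best = bp[best]
        path.append(best)
    return list(reversed(path))

tags = ["B", "I", "O"]
emissions = [
    {"B": 2.0, "I": 0.1, "O": 0.5},  # e.g. first character of a medical term
    {"B": 0.2, "I": 1.8, "O": 0.3},
    {"B": 0.1, "I": 0.2, "O": 1.5},
]
transitions = {("B", "I"): 1.0, ("I", "I"): 0.5, ("O", "B"): 0.5,
               ("I", "O"): 0.3, ("B", "O"): -0.5, ("O", "O"): 0.2}

print(viterbi_decode(emissions, transitions, tags))  # → ['B', 'I', 'O']
```

In the full model this decoding would run over emission scores produced by the soft lattice structure Transformer rather than hand-written dictionaries; the dynamic program itself is the same.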
format | Online Article Text |
id | pubmed-9338545 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-93385452022-07-31 A multi-layer soft lattice based model for Chinese clinical named entity recognition Guo, Shuli; Yang, Wentao; Han, Lina; Song, Xiaowei; Wang, Guowei BMC Med Inform Decis Mak Research BioMed Central 2022-07-30 /pmc/articles/PMC9338545/ /pubmed/35908055 http://dx.doi.org/10.1186/s12911-022-01924-4 Text en © The Author(s) 2022. Open Access: this article is licensed under a Creative Commons Attribution 4.0 International License (https://creativecommons.org/licenses/by/4.0/). The Creative Commons Public Domain Dedication waiver (https://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data. |
spellingShingle | Research Guo, Shuli Yang, Wentao Han, Lina Song, Xiaowei Wang, Guowei A multi-layer soft lattice based model for Chinese clinical named entity recognition |
title | A multi-layer soft lattice based model for Chinese clinical named entity recognition |
title_full | A multi-layer soft lattice based model for Chinese clinical named entity recognition |
title_fullStr | A multi-layer soft lattice based model for Chinese clinical named entity recognition |
title_full_unstemmed | A multi-layer soft lattice based model for Chinese clinical named entity recognition |
title_short | A multi-layer soft lattice based model for Chinese clinical named entity recognition |
title_sort | multi-layer soft lattice based model for chinese clinical named entity recognition |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9338545/ https://www.ncbi.nlm.nih.gov/pubmed/35908055 http://dx.doi.org/10.1186/s12911-022-01924-4 |
work_keys_str_mv | AT guoshuli amultilayersoftlatticebasedmodelforchineseclinicalnamedentityrecognition AT yangwentao amultilayersoftlatticebasedmodelforchineseclinicalnamedentityrecognition AT hanlina amultilayersoftlatticebasedmodelforchineseclinicalnamedentityrecognition AT songxiaowei amultilayersoftlatticebasedmodelforchineseclinicalnamedentityrecognition AT wangguowei amultilayersoftlatticebasedmodelforchineseclinicalnamedentityrecognition AT guoshuli multilayersoftlatticebasedmodelforchineseclinicalnamedentityrecognition AT yangwentao multilayersoftlatticebasedmodelforchineseclinicalnamedentityrecognition AT hanlina multilayersoftlatticebasedmodelforchineseclinicalnamedentityrecognition AT songxiaowei multilayersoftlatticebasedmodelforchineseclinicalnamedentityrecognition AT wangguowei multilayersoftlatticebasedmodelforchineseclinicalnamedentityrecognition |