
Clinical Named Entity Recognition from Chinese Electronic Medical Records Based on Deep Learning Pretraining

BACKGROUND: Clinical named entity recognition is a basic task in mining electronic medical record text. It faces several challenges arising from the language features of Chinese electronic medical record text: many compound entities, frequent omission of sentence components, and unclear entity b...


Bibliographic Details
Main authors: Gong, Lejun, Zhang, Zhifei, Chen, Shiqi
Format: Online Article Text
Language: English
Published: Hindawi 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7707942/
https://www.ncbi.nlm.nih.gov/pubmed/33299537
http://dx.doi.org/10.1155/2020/8829219
_version_ 1783617461293154304
author Gong, Lejun
Zhang, Zhifei
Chen, Shiqi
author_facet Gong, Lejun
Zhang, Zhifei
Chen, Shiqi
author_sort Gong, Lejun
collection PubMed
description BACKGROUND: Clinical named entity recognition is a basic task in mining electronic medical record text. It faces several challenges arising from the language features of Chinese electronic medical record text: many compound entities, frequent omission of sentence components, and unclear entity boundaries. Moreover, corpora of Chinese electronic medical records are difficult to obtain. METHODS: To address these characteristics, this study proposes a Chinese clinical entity recognition model based on deep learning pretraining. The model uses word embeddings trained on a domain corpus and fine-tunes an entity recognition model pretrained on a related corpus. BiLSTM and Transformer are then each used as feature extractors to identify four types of clinical entities (diseases, symptoms, drugs, and operations) in the text of Chinese electronic medical records. RESULTS: The model achieved 75.06% Macro-P, 76.40% Macro-R, and 75.72% Macro-F1 on the test dataset. CONCLUSIONS: These experiments show that the proposed Chinese clinical entity recognition model based on deep learning pretraining can effectively improve recognition performance.
format Online
Article
Text
id pubmed-7707942
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Hindawi
record_format MEDLINE/PubMed
spelling pubmed-7707942 2020-12-08 Clinical Named Entity Recognition from Chinese Electronic Medical Records Based on Deep Learning Pretraining Gong, Lejun Zhang, Zhifei Chen, Shiqi J Healthc Eng Research Article BACKGROUND: Clinical named entity recognition is a basic task in mining electronic medical record text. It faces several challenges arising from the language features of Chinese electronic medical record text: many compound entities, frequent omission of sentence components, and unclear entity boundaries. Moreover, corpora of Chinese electronic medical records are difficult to obtain. METHODS: To address these characteristics, this study proposes a Chinese clinical entity recognition model based on deep learning pretraining. The model uses word embeddings trained on a domain corpus and fine-tunes an entity recognition model pretrained on a related corpus. BiLSTM and Transformer are then each used as feature extractors to identify four types of clinical entities (diseases, symptoms, drugs, and operations) in the text of Chinese electronic medical records. RESULTS: The model achieved 75.06% Macro-P, 76.40% Macro-R, and 75.72% Macro-F1 on the test dataset. CONCLUSIONS: These experiments show that the proposed Chinese clinical entity recognition model based on deep learning pretraining can effectively improve recognition performance. Hindawi 2020-11-24 /pmc/articles/PMC7707942/ /pubmed/33299537 http://dx.doi.org/10.1155/2020/8829219 Text en Copyright © 2020 Lejun Gong et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Research Article
Gong, Lejun
Zhang, Zhifei
Chen, Shiqi
Clinical Named Entity Recognition from Chinese Electronic Medical Records Based on Deep Learning Pretraining
title Clinical Named Entity Recognition from Chinese Electronic Medical Records Based on Deep Learning Pretraining
title_full Clinical Named Entity Recognition from Chinese Electronic Medical Records Based on Deep Learning Pretraining
title_fullStr Clinical Named Entity Recognition from Chinese Electronic Medical Records Based on Deep Learning Pretraining
title_full_unstemmed Clinical Named Entity Recognition from Chinese Electronic Medical Records Based on Deep Learning Pretraining
title_short Clinical Named Entity Recognition from Chinese Electronic Medical Records Based on Deep Learning Pretraining
title_sort clinical named entity recognition from chinese electronic medical records based on deep learning pretraining
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7707942/
https://www.ncbi.nlm.nih.gov/pubmed/33299537
http://dx.doi.org/10.1155/2020/8829219
work_keys_str_mv AT gonglejun clinicalnamedentityrecognitionfromchineseelectronicmedicalrecordsbasedondeeplearningpretraining
AT zhangzhifei clinicalnamedentityrecognitionfromchineseelectronicmedicalrecordsbasedondeeplearningpretraining
AT chenshiqi clinicalnamedentityrecognitionfromchineseelectronicmedicalrecordsbasedondeeplearningpretraining