A hybrid approach for named entity recognition in Chinese electronic medical record
Main authors: | , , , , , , |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2019 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6454595/ https://www.ncbi.nlm.nih.gov/pubmed/30961597 http://dx.doi.org/10.1186/s12911-019-0767-2 |
Summary: | BACKGROUND: With the rapid spread of electronic medical records and the arrival of the medical big data era, the application of natural language processing technology in biomedicine has become a hot research topic. METHODS: First, a BiLSTM-CRF model is applied to medical named entity recognition on Chinese electronic medical records. Following the characteristics of Chinese electronic medical records, a low-dimensional word vector is obtained for each word, sentence by sentence. The word vectors are then fed to the BiLSTM, which extracts sentence features automatically, and the CRF performs sentence-level word tagging. Second, an attention mechanism is added between the BiLSTM and the CRF to construct an Attention-BiLSTM-CRF model, which can leverage document-level information to alleviate tagging inconsistency. In addition, the paper proposes an entity auto-correct algorithm that rectifies entities according to historical entity information. Finally, a drug dictionary and post-processing rules are built to rectify entities and further improve performance. RESULTS: The final F1 scores of the BiLSTM-CRF and Attention-BiLSTM-CRF models on the given test dataset are 90.15% and 90.82% respectively, both higher than 89.26%, the best F1 score achieved on the test dataset by any other system. CONCLUSION: Our approach can be used to recognize medical named entities in Chinese electronic medical records and achieves state-of-the-art performance on the given test dataset. |
---|
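The summary states that the CRF performs sentence-level tagging on top of the BiLSTM features. As an illustration only, the sketch below shows Viterbi decoding, the standard way to pick the best tag sequence from a linear-chain CRF. It is not the authors' implementation: the function name `viterbi_decode`, the three-tag O/B/I scheme, and every score value are assumptions made up for the example.

```python
from typing import List, Sequence


def viterbi_decode(emissions: Sequence[Sequence[float]],
                   transitions: Sequence[Sequence[float]]) -> List[int]:
    """Return the highest-scoring tag sequence for one sentence.

    emissions[t][j]   : score of tag j at position t (e.g. produced by a BiLSTM).
    transitions[i][j] : score of moving from tag i to tag j.
    """
    if not emissions:
        return []
    num_tags = len(emissions[0])
    # score[j] = best score of any path that ends in tag j at the current position
    score = list(emissions[0])
    backpointers = []
    for t in range(1, len(emissions)):
        new_score = []
        pointers = []
        for j in range(num_tags):
            # Best previous tag for reaching tag j at position t
            best_prev = max(range(num_tags),
                            key=lambda i: score[i] + transitions[i][j])
            pointers.append(best_prev)
            new_score.append(score[best_prev] + transitions[best_prev][j]
                             + emissions[t][j])
        score = new_score
        backpointers.append(pointers)
    # Trace back from the best final tag to recover the full path
    path = [max(range(num_tags), key=lambda j: score[j])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Hypothetical tag set: 0 = O (outside), 1 = B (entity start), 2 = I (entity inside)
TRANSITIONS = [
    [0.0, 0.0, -10.0],  # O -> I is heavily penalized: an entity cannot start with I
    [0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0],
]
# Made-up emission scores for a two-token sentence
EMISSIONS = [
    [2.0, 0.0, 0.0],
    [0.0, 0.5, 1.0],
]
print(viterbi_decode(EMISSIONS, TRANSITIONS))
```

In this toy input, choosing the top-scoring tag at each position independently would give O followed by I, which is not a valid entity. Because the decoder scores the whole sentence, the transition penalty steers it to O followed by B instead, which is the kind of sequence-level consistency a CRF layer adds over per-token classification.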