An Infinite Mixture Model for Coreference Resolution in Clinical Notes
Main authors: | , , , |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | American Medical Informatics Association 2016 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5009297/ https://www.ncbi.nlm.nih.gov/pubmed/27595047 |
Summary: | It is widely acknowledged that natural language processing is indispensable for processing electronic health records (EHRs). However, poor performance in relation detection tasks, such as coreference resolution (identifying linguistic expressions that refer to the same entity or event), may affect the quality of EHR processing. Hence, there is a critical need to advance research on relation detection in EHRs. Most clinical coreference resolution systems are based on either supervised machine learning or rule-based methods. The need for a manually annotated corpus hampers the use of such systems at large scale. In this paper, we present an infinite mixture model method using definite sampling to resolve coreferent relations among mentions in clinical notes. A similarity measure function is proposed to determine the coreferent relations. Our system achieved an F-measure of 0.847 on the i2b2 2011 coreference corpus. These promising results and the unsupervised nature of the method make it possible to apply the system in big-data clinical settings. |
---|