
Context-aware multi-token concept recognition of biological entities



Bibliographic Details
Main authors: Kim, Kwangmin; Lee, Doheon
Format: Online Article Text
Language: English
Published: BioMed Central 2021
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8529713/
https://www.ncbi.nlm.nih.gov/pubmed/34674631
http://dx.doi.org/10.1186/s12859-021-04248-8
Description
Summary: BACKGROUND: Concept recognition is a term that corresponds to the two sequential steps of named entity recognition and named entity normalization, and it plays an essential role in the field of bioinformatics. However, conventional dictionary-based methods have not sufficiently addressed the variation of concepts in actual use in the literature, resulting in particularly degraded performance in the recognition of multi-token concepts. RESULTS: In this paper, we propose a concept recognition method for multi-token biological entities using neural models combined with literature contexts. The key aspect of our method is utilizing the contextual information from biological knowledge bases for concept normalization, which is followed by a named entity recognition procedure. The model showed improved performance over conventional methods, particularly for multi-token concepts with higher variation. CONCLUSIONS: We expect that our model can be utilized for effective concept recognition and a variety of natural language processing tasks in bioinformatics.
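
To illustrate the limitation the abstract attributes to conventional dictionary-based methods, the following is a minimal sketch of a longest-match dictionary lookup. It is not the authors' method or their baseline: the dictionary contents, the concept identifiers, and the function name `recognize_concepts` are all hypothetical and chosen only for illustration. It shows how an exact-match dictionary finds a multi-token concept in its canonical form but misses a reordered variant of the same concept.

```python
from typing import Dict, List, Tuple

# Hypothetical toy dictionary: lower-cased token tuples -> concept identifier.
# The identifiers are illustrative placeholders, not real database IDs.
TOY_DICTIONARY: Dict[Tuple[str, ...], str] = {
    ("tumor", "necrosis", "factor", "alpha"): "CONCEPT:0001",
    ("insulin",): "CONCEPT:0002",
}


def recognize_concepts(
    text: str, dictionary: Dict[Tuple[str, ...], str]
) -> List[Tuple[str, str]]:
    """Greedy longest-match lookup of dictionary entries in whitespace tokens."""
    tokens = text.lower().split()
    max_len = max(len(key) for key in dictionary)
    matches: List[Tuple[str, str]] = []
    i = 0
    while i < len(tokens):
        # Try the longest candidate span first, then shrink it.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            span = tuple(tokens[i : i + n])
            if span in dictionary:
                matches.append((" ".join(span), dictionary[span]))
                i += n
                break
        else:
            # No dictionary entry starts at this token; move on.
            i += 1
    return matches


# Canonical surface form: both concepts are found.
canonical = recognize_concepts(
    "insulin regulates tumor necrosis factor alpha", TOY_DICTIONARY
)

# Reordered variant of the multi-token concept: exact matching misses it.
variant = recognize_concepts(
    "insulin regulates alpha tumor necrosis factor", TOY_DICTIONARY
)
```

In this sketch the single-token concept is matched in both sentences, while the four-token concept is matched only when it appears exactly as listed. This is the kind of surface variation that the abstract says motivates using literature context for normalization.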