Single Model for Organic and Inorganic Chemical Named Entity Recognition in ChemDataExtractor

Bibliographic Details
Main authors: Isazawa, Taketomo; Cole, Jacqueline M.
Format: Online Article Text
Language: English
Published: American Chemical Society, 2022
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9049593/
https://www.ncbi.nlm.nih.gov/pubmed/35199519
http://dx.doi.org/10.1021/acs.jcim.1c01199
Description
Summary: Chemical Named Entity Recognition (NER) forms the basis of information extraction tasks in the chemical domain. However, while such tasks can involve multiple domains of chemistry at the same time, currently available named entity recognizers are each specialized in one part of chemistry, so such workflows fail for a biased subset of mentions. This paper presents a single model that performs close to the state of the art on both organic (CHEMDNER, 89.7 F1 score) and inorganic (Matscholar, 88.0 F1 score) NER tasks at the same time. Our NER system, which uses the BERT architecture, is available as part of ChemDataExtractor 2.1, along with the data sets and scripts used to train the model.