Single Model for Organic and Inorganic Chemical Named Entity Recognition in ChemDataExtractor

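Chemical Named Entity Recognition (NER) forms the basis of information extraction tasks in the chemical domain. However, while such tasks can involve multiple domains of chemistry at the same time, currently available named entity recognizers are specialized in one part of chemistry, resulting in such workflows failing for a biased subset of mentions. This paper presents a single model that performs close to the state of the art for both organic (CHEMDNER, 89.7 F1 score) and inorganic (Matscholar, 88.0 F1 score) NER tasks at the same time. Our NER system, which uses the BERT architecture, is available as part of ChemDataExtractor 2.1, along with the data sets and scripts used to train the model.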

Bibliographic Details
Main Authors: Isazawa, Taketomo; Cole, Jacqueline M.
Format: Online Article Text
Language: English
Published: American Chemical Society 2022
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9049593/
https://www.ncbi.nlm.nih.gov/pubmed/35199519
http://dx.doi.org/10.1021/acs.jcim.1c01199
_version_ 1784696173144047616
author Isazawa, Taketomo
Cole, Jacqueline M.
author_facet Isazawa, Taketomo
Cole, Jacqueline M.
author_sort Isazawa, Taketomo
collection PubMed
description Chemical Named Entity Recognition (NER) forms the basis of information extraction tasks in the chemical domain. However, while such tasks can involve multiple domains of chemistry at the same time, currently available named entity recognizers are specialized in one part of chemistry, resulting in such workflows failing for a biased subset of mentions. This paper presents a single model that performs close to the state of the art for both organic (CHEMDNER, 89.7 F1 score) and inorganic (Matscholar, 88.0 F1 score) NER tasks at the same time. Our NER system, which uses the BERT architecture, is available as part of ChemDataExtractor 2.1, along with the data sets and scripts used to train the model.
format Online
Article
Text
id pubmed-9049593
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher American Chemical Society
record_format MEDLINE/PubMed
spelling pubmed-9049593 2022-04-29 Single Model for Organic and Inorganic Chemical Named Entity Recognition in ChemDataExtractor Isazawa, Taketomo Cole, Jacqueline M. J Chem Inf Model Chemical Named Entity Recognition (NER) forms the basis of information extraction tasks in the chemical domain. However, while such tasks can involve multiple domains of chemistry at the same time, currently available named entity recognizers are specialized in one part of chemistry, resulting in such workflows failing for a biased subset of mentions. This paper presents a single model that performs close to the state of the art for both organic (CHEMDNER, 89.7 F1 score) and inorganic (Matscholar, 88.0 F1 score) NER tasks at the same time. Our NER system, which uses the BERT architecture, is available as part of ChemDataExtractor 2.1, along with the data sets and scripts used to train the model. American Chemical Society 2022-02-24 2022-03-14 /pmc/articles/PMC9049593/ /pubmed/35199519 http://dx.doi.org/10.1021/acs.jcim.1c01199 Text en © 2022 American Chemical Society https://creativecommons.org/licenses/by/4.0/ Permits the broadest form of re-use including for commercial purposes, provided that author attribution and integrity are maintained (https://creativecommons.org/licenses/by/4.0/).
spellingShingle Isazawa, Taketomo
Cole, Jacqueline M.
Single Model for Organic and Inorganic Chemical Named Entity Recognition in ChemDataExtractor
title Single Model for Organic and Inorganic Chemical Named Entity Recognition in ChemDataExtractor
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9049593/
https://www.ncbi.nlm.nih.gov/pubmed/35199519
http://dx.doi.org/10.1021/acs.jcim.1c01199