NLM-Chem, a new resource for chemical entity recognition in PubMed full text literature
Automatically identifying chemical and drug names in scientific publications advances information access for this important class of entities in a variety of biomedical disciplines by enabling improved retrieval and linkage to related concepts. While current methods for tagging chemical entities wer...
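Main authors: | Islamaj, Rezarta Leaman, Robert Kim, Sun Kwon, Dongseop Wei, Chih-Hsuan Comeau, Donald C. Peng, Yifan Cissel, David Coss, Cathleen Fisher, Carol Guzman, Rob Kochar, Preeti Gokal Koppel, Stella Trinh, Dorothy Sekiya, Keiko Ward, Janice Whitman, Deborah Schmidt, Susan Lu, Zhiyong |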
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Nature Publishing Group UK 2021 |
Subjects: | Data Descriptor |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7994842/ https://www.ncbi.nlm.nih.gov/pubmed/33767203 http://dx.doi.org/10.1038/s41597-021-00875-1 |
_version_ | 1783669840740876288 |
---|---|
author | Islamaj, Rezarta Leaman, Robert Kim, Sun Kwon, Dongseop Wei, Chih-Hsuan Comeau, Donald C. Peng, Yifan Cissel, David Coss, Cathleen Fisher, Carol Guzman, Rob Kochar, Preeti Gokal Koppel, Stella Trinh, Dorothy Sekiya, Keiko Ward, Janice Whitman, Deborah Schmidt, Susan Lu, Zhiyong |
author_facet | Islamaj, Rezarta Leaman, Robert Kim, Sun Kwon, Dongseop Wei, Chih-Hsuan Comeau, Donald C. Peng, Yifan Cissel, David Coss, Cathleen Fisher, Carol Guzman, Rob Kochar, Preeti Gokal Koppel, Stella Trinh, Dorothy Sekiya, Keiko Ward, Janice Whitman, Deborah Schmidt, Susan Lu, Zhiyong |
author_sort | Islamaj, Rezarta |
collection | PubMed |
description | Automatically identifying chemical and drug names in scientific publications advances information access for this important class of entities in a variety of biomedical disciplines by enabling improved retrieval and linkage to related concepts. While current methods for tagging chemical entities were developed for the article title and abstract, their performance in the full article text is substantially lower. However, the full text frequently contains more detailed chemical information, such as the properties of chemical compounds, their biological effects and interactions with diseases, genes and other chemicals. We therefore present the NLM-Chem corpus, a full-text resource to support the development and evaluation of automated chemical entity taggers. The NLM-Chem corpus consists of 150 full-text articles, doubly annotated by ten expert NLM indexers, with ~5000 unique chemical name annotations, mapped to ~2000 MeSH identifiers. We also describe a substantially improved chemical entity tagger, with automated annotations for all of PubMed and PMC freely accessible through the PubTator web-based interface and API. The NLM-Chem corpus is freely available. |
format | Online Article Text |
id | pubmed-7994842 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | Nature Publishing Group UK |
record_format | MEDLINE/PubMed |
spelling | pubmed-79948422021-04-16 NLM-Chem, a new resource for chemical entity recognition in PubMed full text literature Islamaj, Rezarta Leaman, Robert Kim, Sun Kwon, Dongseop Wei, Chih-Hsuan Comeau, Donald C. Peng, Yifan Cissel, David Coss, Cathleen Fisher, Carol Guzman, Rob Kochar, Preeti Gokal Koppel, Stella Trinh, Dorothy Sekiya, Keiko Ward, Janice Whitman, Deborah Schmidt, Susan Lu, Zhiyong Sci Data Data Descriptor Automatically identifying chemical and drug names in scientific publications advances information access for this important class of entities in a variety of biomedical disciplines by enabling improved retrieval and linkage to related concepts. While current methods for tagging chemical entities were developed for the article title and abstract, their performance in the full article text is substantially lower. However, the full text frequently contains more detailed chemical information, such as the properties of chemical compounds, their biological effects and interactions with diseases, genes and other chemicals. We therefore present the NLM-Chem corpus, a full-text resource to support the development and evaluation of automated chemical entity taggers. The NLM-Chem corpus consists of 150 full-text articles, doubly annotated by ten expert NLM indexers, with ~5000 unique chemical name annotations, mapped to ~2000 MeSH identifiers. We also describe a substantially improved chemical entity tagger, with automated annotations for all of PubMed and PMC freely accessible through the PubTator web-based interface and API. The NLM-Chem corpus is freely available. Nature Publishing Group UK 2021-03-25 /pmc/articles/PMC7994842/ /pubmed/33767203 http://dx.doi.org/10.1038/s41597-021-00875-1 Text en © This is a U.S. Government work and not under copyright protection in the US; foreign copyright protection may apply 2021 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver http://creativecommons.org/publicdomain/zero/1.0/ applies to the metadata files associated with this article. |
spellingShingle | Data Descriptor Islamaj, Rezarta Leaman, Robert Kim, Sun Kwon, Dongseop Wei, Chih-Hsuan Comeau, Donald C. Peng, Yifan Cissel, David Coss, Cathleen Fisher, Carol Guzman, Rob Kochar, Preeti Gokal Koppel, Stella Trinh, Dorothy Sekiya, Keiko Ward, Janice Whitman, Deborah Schmidt, Susan Lu, Zhiyong NLM-Chem, a new resource for chemical entity recognition in PubMed full text literature |
title | NLM-Chem, a new resource for chemical entity recognition in PubMed full text literature |
title_full | NLM-Chem, a new resource for chemical entity recognition in PubMed full text literature |
title_fullStr | NLM-Chem, a new resource for chemical entity recognition in PubMed full text literature |
title_full_unstemmed | NLM-Chem, a new resource for chemical entity recognition in PubMed full text literature |
title_short | NLM-Chem, a new resource for chemical entity recognition in PubMed full text literature |
title_sort | nlm-chem, a new resource for chemical entity recognition in pubmed full text literature |
topic | Data Descriptor |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7994842/ https://www.ncbi.nlm.nih.gov/pubmed/33767203 http://dx.doi.org/10.1038/s41597-021-00875-1 |
work_keys_str_mv | AT islamajrezarta nlmchemanewresourceforchemicalentityrecognitioninpubmedfulltextliterature AT leamanrobert nlmchemanewresourceforchemicalentityrecognitioninpubmedfulltextliterature AT kimsun nlmchemanewresourceforchemicalentityrecognitioninpubmedfulltextliterature AT kwondongseop nlmchemanewresourceforchemicalentityrecognitioninpubmedfulltextliterature AT weichihhsuan nlmchemanewresourceforchemicalentityrecognitioninpubmedfulltextliterature AT comeaudonaldc nlmchemanewresourceforchemicalentityrecognitioninpubmedfulltextliterature AT pengyifan nlmchemanewresourceforchemicalentityrecognitioninpubmedfulltextliterature AT cisseldavid nlmchemanewresourceforchemicalentityrecognitioninpubmedfulltextliterature AT cosscathleen nlmchemanewresourceforchemicalentityrecognitioninpubmedfulltextliterature AT fishercarol nlmchemanewresourceforchemicalentityrecognitioninpubmedfulltextliterature AT guzmanrob nlmchemanewresourceforchemicalentityrecognitioninpubmedfulltextliterature AT kocharpreetigokal nlmchemanewresourceforchemicalentityrecognitioninpubmedfulltextliterature AT koppelstella nlmchemanewresourceforchemicalentityrecognitioninpubmedfulltextliterature AT trinhdorothy nlmchemanewresourceforchemicalentityrecognitioninpubmedfulltextliterature AT sekiyakeiko nlmchemanewresourceforchemicalentityrecognitioninpubmedfulltextliterature AT wardjanice nlmchemanewresourceforchemicalentityrecognitioninpubmedfulltextliterature AT whitmandeborah nlmchemanewresourceforchemicalentityrecognitioninpubmedfulltextliterature AT schmidtsusan nlmchemanewresourceforchemicalentityrecognitioninpubmedfulltextliterature AT luzhiyong nlmchemanewresourceforchemicalentityrecognitioninpubmedfulltextliterature |