
LitCovid-AGAC: cellular and molecular level annotation data set based on COVID-19

The coronavirus disease 2019 (COVID-19) literature has been growing dramatically, and the increased volume of text makes large-scale text mining and knowledge discovery possible. Curation of these texts has therefore become a crucial issue for Bio-medical Natural Language Processin...


Bibliographic Details
Main Authors: Ouyang, Sizhuo; Wang, Yuxing; Zhou, Kaiyin; Xia, Jingbo
Format: Online Article Text
Language: English
Published: Korea Genome Organization 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8510875/
https://www.ncbi.nlm.nih.gov/pubmed/34638170
http://dx.doi.org/10.5808/gi.21013
_version_ 1784582665890955264
author Ouyang, Sizhuo
Wang, Yuxing
Zhou, Kaiyin
Xia, Jingbo
author_facet Ouyang, Sizhuo
Wang, Yuxing
Zhou, Kaiyin
Xia, Jingbo
author_sort Ouyang, Sizhuo
collection PubMed
description The coronavirus disease 2019 (COVID-19) literature has been growing dramatically, and the increased volume of text makes large-scale text mining and knowledge discovery possible. Curation of these texts has therefore become a crucial issue for the biomedical natural language processing (BioNLP) community, in order to retrieve important information about the mechanism of COVID-19. PubAnnotation is an aligned annotation system that provides an efficient platform for biological curators to upload their annotations or merge external annotations. Inspired by the integration of multiple useful COVID-19 annotations, we merged three annotation resources into the LitCovid data set and constructed a cross-annotated corpus, LitCovid-AGAC. The corpus carries 12 labels over 50,018 COVID-19 abstracts in LitCovid: Mutation, Species, Gene, and Disease from PubTator; GO and CHEBI from OGER; and Var, MPA, CPA, NegReg, PosReg, and Reg from AGAC. It contains abundant information that may help unveil hidden knowledge about the pathological mechanism of COVID-19.
format Online
Article
Text
id pubmed-8510875
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Korea Genome Organization
record_format MEDLINE/PubMed
spelling pubmed-8510875 2021-10-22 LitCovid-AGAC: cellular and molecular level annotation data set based on COVID-19 Ouyang, Sizhuo Wang, Yuxing Zhou, Kaiyin Xia, Jingbo Genomics Inform Blah7 The coronavirus disease 2019 (COVID-19) literature has been growing dramatically, and the increased volume of text makes large-scale text mining and knowledge discovery possible. Curation of these texts has therefore become a crucial issue for the biomedical natural language processing (BioNLP) community, in order to retrieve important information about the mechanism of COVID-19. PubAnnotation is an aligned annotation system that provides an efficient platform for biological curators to upload their annotations or merge external annotations. Inspired by the integration of multiple useful COVID-19 annotations, we merged three annotation resources into the LitCovid data set and constructed a cross-annotated corpus, LitCovid-AGAC. The corpus carries 12 labels over 50,018 COVID-19 abstracts in LitCovid: Mutation, Species, Gene, and Disease from PubTator; GO and CHEBI from OGER; and Var, MPA, CPA, NegReg, PosReg, and Reg from AGAC. It contains abundant information that may help unveil hidden knowledge about the pathological mechanism of COVID-19. Korea Genome Organization 2021-09-30 /pmc/articles/PMC8510875/ /pubmed/34638170 http://dx.doi.org/10.5808/gi.21013 Text en (c) 2021, Korea Genome Organization https://creativecommons.org/licenses/by/4.0/ (CC) This is an open-access article distributed under the terms of the Creative Commons Attribution license (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Blah7
Ouyang, Sizhuo
Wang, Yuxing
Zhou, Kaiyin
Xia, Jingbo
LitCovid-AGAC: cellular and molecular level annotation data set based on COVID-19
title LitCovid-AGAC: cellular and molecular level annotation data set based on COVID-19
title_full LitCovid-AGAC: cellular and molecular level annotation data set based on COVID-19
title_fullStr LitCovid-AGAC: cellular and molecular level annotation data set based on COVID-19
title_full_unstemmed LitCovid-AGAC: cellular and molecular level annotation data set based on COVID-19
title_short LitCovid-AGAC: cellular and molecular level annotation data set based on COVID-19
title_sort litcovid-agac: cellular and molecular level annotation data set based on covid-19
topic Blah7
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8510875/
https://www.ncbi.nlm.nih.gov/pubmed/34638170
http://dx.doi.org/10.5808/gi.21013
work_keys_str_mv AT ouyangsizhuo litcovidagaccellularandmolecularlevelannotationdatasetbasedoncovid19
AT wangyuxing litcovidagaccellularandmolecularlevelannotationdatasetbasedoncovid19
AT zhoukaiyin litcovidagaccellularandmolecularlevelannotationdatasetbasedoncovid19
AT xiajingbo litcovidagaccellularandmolecularlevelannotationdatasetbasedoncovid19
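
The description field above groups the corpus's 12 labels by the three annotation resources they come from. As a minimal sketch of that grouping, assuming annotations arrive as simple (label, text) pairs (a hypothetical format chosen for illustration, not the corpus's actual file layout), the mapping and a per-source tally might look like this in Python:

```python
from collections import Counter

# Label-to-source grouping exactly as listed in the description field.
LABELS_BY_SOURCE = {
    "PubTator": ["Mutation", "Species", "Gene", "Disease"],
    "OGER": ["GO", "CHEBI"],
    "AGAC": ["Var", "MPA", "CPA", "NegReg", "PosReg", "Reg"],
}

# Inverted lookup: label -> annotation resource.
SOURCE_BY_LABEL = {
    label: source
    for source, labels in LABELS_BY_SOURCE.items()
    for label in labels
}


def count_by_source(annotations):
    """Tally annotations per resource.

    `annotations` is an iterable of (label, text) pairs; this pair format
    is an assumption made for the example. Labels outside the 12 listed
    in the record are counted under "unknown".
    """
    counts = Counter()
    for label, _text in annotations:
        counts[SOURCE_BY_LABEL.get(label, "unknown")] += 1
    return counts


if __name__ == "__main__":
    # Invented sample pairs, for illustration only.
    sample = [("Gene", "ACE2"), ("Var", "mutation"), ("GO", "binding")]
    print(count_by_source(sample))
```

The sample pairs in the usage block are invented; only the label names and their source resources come from the record.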