Using the PubAnnotation ecosystem to perform agile text mining on Genomics & Informatics: a tutorial review
The prototype version of the full-text corpus of Genomics & Informatics has recently been archived in a GitHub repository. The full-text publications of volumes 10 through 17 are also directly downloadable from PubMed Central (PMC) as XML files. During the Biomedical Linked Annotation Hackathon...
Main Authors: | Nam, Hee-Jo; Yamada, Ryota; Park, Hyun-Seok |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Korea Genome Organization 2020 |
Subjects: | Review Article |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7362947/ https://www.ncbi.nlm.nih.gov/pubmed/32634867 http://dx.doi.org/10.5808/GI.2020.18.2.e13 |
_version_ | 1783559585106231296 |
---|---|
author | Nam, Hee-Jo; Yamada, Ryota; Park, Hyun-Seok |
author_facet | Nam, Hee-Jo; Yamada, Ryota; Park, Hyun-Seok |
author_sort | Nam, Hee-Jo |
collection | PubMed |
description | The prototype version of the full-text corpus of Genomics & Informatics has recently been archived in a GitHub repository. The full-text publications of volumes 10 through 17 are also directly downloadable from PubMed Central (PMC) as XML files. During the Biomedical Linked Annotation Hackathon 6 (BLAH6), we experimented with converting, annotating, and updating 301 PMC full-text articles of Genomics & Informatics using PubAnnotation, a system that provides a convenient way to add PMC publications based on PMCID. Thus, this review aims to provide a tutorial overview of practicing the iterative task of named entity recognition with the PubAnnotation/PubDictionaries/TextAE ecosystem. We also describe developing a conversion tool between the Genia tagger output and the JSON format of PubAnnotation during the hackathon. |
format | Online Article Text |
id | pubmed-7362947 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | Korea Genome Organization |
record_format | MEDLINE/PubMed |
spelling | pubmed-7362947 2020-07-23 Using the PubAnnotation ecosystem to perform agile text mining on Genomics & Informatics: a tutorial review Nam, Hee-Jo; Yamada, Ryota; Park, Hyun-Seok Genomics Inform Review Article The prototype version of the full-text corpus of Genomics & Informatics has recently been archived in a GitHub repository. The full-text publications of volumes 10 through 17 are also directly downloadable from PubMed Central (PMC) as XML files. During the Biomedical Linked Annotation Hackathon 6 (BLAH6), we experimented with converting, annotating, and updating 301 PMC full-text articles of Genomics & Informatics using PubAnnotation, a system that provides a convenient way to add PMC publications based on PMCID. Thus, this review aims to provide a tutorial overview of practicing the iterative task of named entity recognition with the PubAnnotation/PubDictionaries/TextAE ecosystem. We also describe developing a conversion tool between the Genia tagger output and the JSON format of PubAnnotation during the hackathon. Korea Genome Organization 2020-06-16 /pmc/articles/PMC7362947/ /pubmed/32634867 http://dx.doi.org/10.5808/GI.2020.18.2.e13 Text en (c) 2020, Korea Genome Organization (CC) This is an open-access article distributed under the terms of the Creative Commons Attribution license (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Review Article Nam, Hee-Jo Yamada, Ryota Park, Hyun-Seok Using the PubAnnotation ecosystem to perform agile text mining on Genomics & Informatics: a tutorial review |
title | Using the PubAnnotation ecosystem to perform agile text mining on Genomics & Informatics: a tutorial review |
topic | Review Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7362947/ https://www.ncbi.nlm.nih.gov/pubmed/32634867 http://dx.doi.org/10.5808/GI.2020.18.2.e13 |