GNI Corpus Version 1.0: Annotated Full-Text Corpus of Genomics & Informatics to Support Biomedical Information Extraction
Genomics & Informatics (NLM title abbreviation: Genomics Inform) is the official journal of the Korea Genome Organization. Text corpus for this journal annotated with various levels of linguistic information would be a valuable resource as the process of information extraction requires syntactic...
Main Authors: | Oh, So-Yeon; Kim, Ji-Hyeon; Kim, Seo-Jin; Nam, Hee-Jo; Park, Hyun-Seok |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Society of Gastrointestinal Intervention 2018 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6187819/ https://www.ncbi.nlm.nih.gov/pubmed/30309207 http://dx.doi.org/10.5808/GI.2018.16.3.75 |
_version_ | 1783363093431058432 |
---|---|
author | Oh, So-Yeon Kim, Ji-Hyeon Kim, Seo-Jin Nam, Hee-Jo Park, Hyun-Seok |
author_facet | Oh, So-Yeon Kim, Ji-Hyeon Kim, Seo-Jin Nam, Hee-Jo Park, Hyun-Seok |
author_sort | Oh, So-Yeon |
collection | PubMed |
description | Genomics & Informatics (NLM title abbreviation: Genomics Inform) is the official journal of the Korea Genome Organization. Text corpus for this journal annotated with various levels of linguistic information would be a valuable resource as the process of information extraction requires syntactic, semantic, and higher levels of natural language processing. In this study, we publish our new corpus called GNI Corpus version 1.0, extracted and annotated from full texts of Genomics & Informatics, with NLTK (Natural Language ToolKit)-based text mining script. The preliminary version of the corpus could be used as a training and testing set of a system that serves a variety of functions for future biomedical text mining. |
format | Online Article Text |
id | pubmed-6187819 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2018 |
publisher | Society of Gastrointestinal Intervention |
record_format | MEDLINE/PubMed |
spelling | pubmed-61878192018-10-17 GNI Corpus Version 1.0: Annotated Full-Text Corpus of Genomics & Informatics to Support Biomedical Information Extraction Oh, So-Yeon Kim, Ji-Hyeon Kim, Seo-Jin Nam, Hee-Jo Park, Hyun-Seok Genomics Inform Application Note Genomics & Informatics (NLM title abbreviation: Genomics Inform) is the official journal of the Korea Genome Organization. Text corpus for this journal annotated with various levels of linguistic information would be a valuable resource as the process of information extraction requires syntactic, semantic, and higher levels of natural language processing. In this study, we publish our new corpus called GNI Corpus version 1.0, extracted and annotated from full texts of Genomics & Informatics, with NLTK (Natural Language ToolKit)-based text mining script. The preliminary version of the corpus could be used as a training and testing set of a system that serves a variety of functions for future biomedical text mining. Society of Gastrointestinal Intervention 2018-09 2018-09-30 /pmc/articles/PMC6187819/ /pubmed/30309207 http://dx.doi.org/10.5808/GI.2018.16.3.75 Text en Copyright © 2018 by the Korea Genome Organization It is identical to the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/). |
spellingShingle | Application Note Oh, So-Yeon Kim, Ji-Hyeon Kim, Seo-Jin Nam, Hee-Jo Park, Hyun-Seok GNI Corpus Version 1.0: Annotated Full-Text Corpus of Genomics & Informatics to Support Biomedical Information Extraction |
title | GNI Corpus Version 1.0: Annotated Full-Text Corpus of Genomics & Informatics to Support Biomedical Information Extraction |
title_full | GNI Corpus Version 1.0: Annotated Full-Text Corpus of Genomics & Informatics to Support Biomedical Information Extraction |
title_fullStr | GNI Corpus Version 1.0: Annotated Full-Text Corpus of Genomics & Informatics to Support Biomedical Information Extraction |
title_full_unstemmed | GNI Corpus Version 1.0: Annotated Full-Text Corpus of Genomics & Informatics to Support Biomedical Information Extraction |
title_short | GNI Corpus Version 1.0: Annotated Full-Text Corpus of Genomics & Informatics to Support Biomedical Information Extraction |
title_sort | gni corpus version 1.0: annotated full-text corpus of genomics & informatics to support biomedical information extraction |
topic | Application Note |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6187819/ https://www.ncbi.nlm.nih.gov/pubmed/30309207 http://dx.doi.org/10.5808/GI.2018.16.3.75 |
work_keys_str_mv | AT ohsoyeon gnicorpusversion10annotatedfulltextcorpusofgenomicsinformaticstosupportbiomedicalinformationextraction AT kimjihyeon gnicorpusversion10annotatedfulltextcorpusofgenomicsinformaticstosupportbiomedicalinformationextraction AT kimseojin gnicorpusversion10annotatedfulltextcorpusofgenomicsinformaticstosupportbiomedicalinformationextraction AT namheejo gnicorpusversion10annotatedfulltextcorpusofgenomicsinformaticstosupportbiomedicalinformationextraction AT parkhyunseok gnicorpusversion10annotatedfulltextcorpusofgenomicsinformaticstosupportbiomedicalinformationextraction |