
Knowledge-Driven Event Extraction in Russian: Corpus-Based Linguistic Resources

Automatic event extraction from text is an important step in knowledge acquisition and knowledge base population. Manual work in the development of an extraction system is indispensable, either in corpus annotation or in the creation of vocabularies and patterns for a knowledge-based system. Recent works have been...


Bibliographic Details
Main Authors: Solovyev, Valery; Ivanov, Vladimir
Format: Online Article Text
Language: English
Published: Hindawi Publishing Corporation 2016
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4756580/
https://www.ncbi.nlm.nih.gov/pubmed/26955386
http://dx.doi.org/10.1155/2016/4183760
author Solovyev, Valery
Ivanov, Vladimir
author_facet Solovyev, Valery
Ivanov, Vladimir
author_sort Solovyev, Valery
collection PubMed
description Automatic event extraction from text is an important step in knowledge acquisition and knowledge base population. Manual work in the development of an extraction system is indispensable, either in corpus annotation or in the creation of vocabularies and patterns for a knowledge-based system. Recent work has focused on adapting existing systems (for extraction from English texts) to new domains. Event extraction in other languages has not been studied, owing to the lack of resources and algorithms necessary for natural language processing. In this paper we define a set of linguistic resources that are necessary in the development of a knowledge-based event extraction system in Russian: a vocabulary of subordination models, a vocabulary of event triggers, and a vocabulary of Frame Elements that are basic building blocks for semantic patterns. We propose a set of methods for creating such vocabularies in Russian and other languages using the Google Books NGram Corpus. The methods are evaluated in the development of an event extraction system for Russian.
format Online
Article
Text
id pubmed-4756580
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher Hindawi Publishing Corporation
record_format MEDLINE/PubMed
spelling pubmed-4756580 2016-03-07 Knowledge-Driven Event Extraction in Russian: Corpus-Based Linguistic Resources Solovyev, Valery Ivanov, Vladimir Comput Intell Neurosci Research Article Automatic event extraction from text is an important step in knowledge acquisition and knowledge base population. Manual work in the development of an extraction system is indispensable, either in corpus annotation or in the creation of vocabularies and patterns for a knowledge-based system. Recent work has focused on adapting existing systems (for extraction from English texts) to new domains. Event extraction in other languages has not been studied, owing to the lack of resources and algorithms necessary for natural language processing. In this paper we define a set of linguistic resources that are necessary in the development of a knowledge-based event extraction system in Russian: a vocabulary of subordination models, a vocabulary of event triggers, and a vocabulary of Frame Elements that are basic building blocks for semantic patterns. We propose a set of methods for creating such vocabularies in Russian and other languages using the Google Books NGram Corpus. The methods are evaluated in the development of an event extraction system for Russian. Hindawi Publishing Corporation 2016 2016-01-05 /pmc/articles/PMC4756580/ /pubmed/26955386 http://dx.doi.org/10.1155/2016/4183760 Text en Copyright © 2016 V. Solovyev and V. Ivanov. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Research Article
Solovyev, Valery
Ivanov, Vladimir
Knowledge-Driven Event Extraction in Russian: Corpus-Based Linguistic Resources
title Knowledge-Driven Event Extraction in Russian: Corpus-Based Linguistic Resources
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4756580/
https://www.ncbi.nlm.nih.gov/pubmed/26955386
http://dx.doi.org/10.1155/2016/4183760