Cargando…
Using FLOSS for Storing, Processing and Linking Corpus Data
Corpus data is widely used to solve different linguistic, educational and applied problems. The Tatar corpus management system (http://tugantel.tatar) is specifically developed for Turkic languages. The functionality of our corpus management system includes a search of lexical units, morphological a...
Autores principales: | , , |
---|---|
Formato: | Online Artículo Texto |
Lenguaje: | English |
Publicado: |
2020
|
Materias: | |
Acceso en línea: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7198243/ http://dx.doi.org/10.1007/978-3-030-47240-5_17 |
Sumario: | Corpus data is widely used to solve different linguistic, educational and applied problems. The Tatar corpus management system (http://tugantel.tatar) is specifically developed for Turkic languages. The functionality of our corpus management system includes a search of lexical units, morphological and lexical search, a search of syntactic units, a search of N-grams and others. The search is performed using open source tools (database management system MariaDB, Redis data store). This article describes the process of choosing FLOSS for the main components of our system and also processing a search query and building a linked open dataset based on corpus data. |
---|