Dataset of Karakalpak language stop words
The dataset presented in this paper aims to address the challenge of automatic extraction of stop words in Natural Language Processing (NLP) for the low-resource Karakalpak language spoken by approximately two million people in Uzbekistan. To accomplish this, we have created a corpus of 23 Karakalpa...
Main authors: | Madatov, Khabibulla; Bekchanov, Shukurla; Vičič, Jernej |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Elsevier 2023 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10126844/ https://www.ncbi.nlm.nih.gov/pubmed/37113499 http://dx.doi.org/10.1016/j.dib.2023.109111 |
Similar items
- Dataset of stopwords extracted from Uzbek texts
  by: Madatov, Khabibulla, et al.
  Published: (2022)
- Enhancing text pre-processing for Swahili language: Datasets for common Swahili stop-words, slangs and typos with equivalent proper words
  by: Masua, Bernard, et al.
  Published: (2020)
- MyWSL: Malaysian words sign language dataset
  by: Johari, Rina Tasia, et al.
  Published: (2023)
- BdSLW-11: Dataset of Bangladeshi sign language words for recognizing 11 daily useful BdSL words
  by: Islam, Md. Monirul, et al.
  Published: (2022)
- Data about fall events and ordinary daily activities from a sensorized smart floor
  by: Tošić, Aleksandar, et al.
  Published: (2021)