Using Twitter to collect a multi-dialectal corpus of Albanian using advanced geotagging and dialect modeling
In this study, we present the acquisition and categorization of a geographically-informed, multi-dialectal Albanian National Corpus, derived from Twitter data. The primary dialects from three distinct regions—Albania, Kosovo, and North Macedonia—are considered. The assembled publicly available datas...
Main authors: | , |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science, 2023 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10681245/ https://www.ncbi.nlm.nih.gov/pubmed/38011168 http://dx.doi.org/10.1371/journal.pone.0294284 |