
Geolocation of multiple sociolinguistic markers in Buenos Aires


Bibliographic Details
Main authors: Kellert, Olga; Matlis, Nicholas H.
Format: Online Article Text
Language: English
Published: Public Library of Science 2022
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9462814/
https://www.ncbi.nlm.nih.gov/pubmed/36084118
http://dx.doi.org/10.1371/journal.pone.0274114
Description
Summary: Analysis of language geography is increasingly being used for studying spatial patterns of social dynamics. This trend is fueled by social media platforms such as Twitter, which provide access to large amounts of natural language data combined with geolocation and user metadata, enabling reconstruction of detailed spatial patterns of language use. Most studies are performed on large spatial scales associated with countries and regions, where language dynamics are often dominated by the effects of geographic and administrative borders. Extending to smaller, urban scales, however, allows visualization of spatial patterns of language use determined by social dynamics within the city, providing valuable information for a range of social topics from demographic studies to urban planning. So far, few studies have been conducted in this domain, due in part to the challenges of developing algorithms that accurately classify linguistic features. Here we extend urban-scale geographical analysis of language use beyond lexical meaning to include other sociolinguistic markers that identify language style, dialect and social groups. Some features, which have not been explored with social-media data on the urban scale, can be used to target a range of social phenomena. Our study focuses on Twitter use in Buenos Aires, and our approach classifies tweets based on contrasting sets of tokens manually selected to target precise linguistic features. We perform statistical analyses of eleven categories of language use to quantify the presence of spatial patterns and the extent to which they are socially driven. We then perform the first comparative analysis assessing how the patterns and strength of social drivers vary with category. Finally, we derive plausible explanations for the patterns by comparing them with independently generated maps of geosocial context. Identifying these connections is a key aspect of the social-dynamics analysis that has so far received insufficient attention.
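
The classification step mentioned in the summary (labeling tweets by contrasting sets of tokens, then examining where each variant occurs) could be sketched roughly as follows. This is an illustration only, not the authors' code: the token sets (voseo versus tuteo verb forms), the function names `classify`, `cell_of` and `share_of_a_by_cell`, and the 0.01-degree grid are all assumptions made for the example, and the record does not list the paper's eleven categories or describe its statistical tests.

```python
import math
import re
from collections import Counter

# Hypothetical contrasting token sets for one category (assumption, not from the paper):
# voseo verb forms versus their tuteo counterparts.
VARIANT_A = {"tenés", "querés", "podés"}
VARIANT_B = {"tienes", "quieres", "puedes"}


def classify(text):
    """Return "A" or "B" if the text contains tokens from exactly one set, else None."""
    tokens = set(re.findall(r"\w+", text.lower()))
    has_a = bool(tokens & VARIANT_A)
    has_b = bool(tokens & VARIANT_B)
    if has_a == has_b:
        # Neither variant present, or both present: treat as unclassifiable.
        return None
    return "A" if has_a else "B"


def cell_of(lat, lon, size=0.01):
    """Map a coordinate to a square grid cell of `size` degrees (illustrative binning)."""
    return (math.floor(lat / size), math.floor(lon / size))


def share_of_a_by_cell(tweets, size=0.01):
    """For each grid cell, compute the fraction of classified tweets labeled "A".

    `tweets` is an iterable of (text, latitude, longitude) tuples.
    """
    counts = {}
    for text, lat, lon in tweets:
        label = classify(text)
        if label is None:
            continue
        counts.setdefault(cell_of(lat, lon, size), Counter())[label] += 1
    return {cell: c["A"] / (c["A"] + c["B"]) for cell, c in counts.items()}
```

Per-cell shares of this kind are only a starting point; deciding whether a spatial pattern is present and socially driven would still require the statistical analysis the summary refers to.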