Mapping Lexical Dialect Variation in British English Using Twitter
There is a growing trend in regional dialectology to analyse large corpora of social media data, but it is unclear if the results of these studies can be generalized to language as a whole. To assess the generalizability of Twitter dialect maps, this paper presents the first systematic comparison of...
Main authors: | Grieve, Jack; Montgomery, Chris; Nini, Andrea; Murakami, Akira; Guo, Diansheng |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Frontiers Media S.A. 2019 |
Subjects: | Artificial Intelligence |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7861259/ https://www.ncbi.nlm.nih.gov/pubmed/33733100 http://dx.doi.org/10.3389/frai.2019.00011 |
_version_ | 1783647047144964096 |
---|---|
author | Grieve, Jack; Montgomery, Chris; Nini, Andrea; Murakami, Akira; Guo, Diansheng |
author_facet | Grieve, Jack; Montgomery, Chris; Nini, Andrea; Murakami, Akira; Guo, Diansheng |
author_sort | Grieve, Jack |
collection | PubMed |
description | There is a growing trend in regional dialectology to analyse large corpora of social media data, but it is unclear if the results of these studies can be generalized to language as a whole. To assess the generalizability of Twitter dialect maps, this paper presents the first systematic comparison of regional lexical variation in Twitter corpora and traditional survey data. We compare the regional patterns found in 139 lexical dialect maps based on a 1.8 billion word corpus of geolocated UK Twitter data and the BBC Voices dialect survey. A spatial analysis of these 139 map pairs finds a broad alignment between these two data sources, offering evidence that both approaches to data collection allow for the same basic underlying regional patterns to be identified. We argue that these results license the use of Twitter corpora for general inquiries into regional lexical variation and change. |
format | Online Article Text |
id | pubmed-7861259 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2019 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-78612592021-03-16 Mapping Lexical Dialect Variation in British English Using Twitter Grieve, Jack Montgomery, Chris Nini, Andrea Murakami, Akira Guo, Diansheng Front Artif Intell Artificial Intelligence There is a growing trend in regional dialectology to analyse large corpora of social media data, but it is unclear if the results of these studies can be generalized to language as a whole. To assess the generalizability of Twitter dialect maps, this paper presents the first systematic comparison of regional lexical variation in Twitter corpora and traditional survey data. We compare the regional patterns found in 139 lexical dialect maps based on a 1.8 billion word corpus of geolocated UK Twitter data and the BBC Voices dialect survey. A spatial analysis of these 139 map pairs finds a broad alignment between these two data sources, offering evidence that both approaches to data collection allow for the same basic underlying regional patterns to be identified. We argue that these results license the use of Twitter corpora for general inquiries into regional lexical variation and change. Frontiers Media S.A. 2019-07-12 /pmc/articles/PMC7861259/ /pubmed/33733100 http://dx.doi.org/10.3389/frai.2019.00011 Text en Copyright © 2019 Grieve, Montgomery, Nini, Murakami and Guo. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
spellingShingle | Artificial Intelligence Grieve, Jack Montgomery, Chris Nini, Andrea Murakami, Akira Guo, Diansheng Mapping Lexical Dialect Variation in British English Using Twitter |
title | Mapping Lexical Dialect Variation in British English Using Twitter |
title_sort | mapping lexical dialect variation in british english using twitter |
topic | Artificial Intelligence |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7861259/ https://www.ncbi.nlm.nih.gov/pubmed/33733100 http://dx.doi.org/10.3389/frai.2019.00011 |