Cargando…

Social Media and Language Processing: How Facebook and Twitter Provide the Best Frequency Estimates for Studying Word Recognition

Corpus‐based word frequencies are one of the most important predictors in language processing tasks. Frequencies based on conversational corpora (such as movie subtitles) are shown to better capture the variance in lexical decision tasks compared to traditional corpora. In this study, we show that f...

Descripción completa

Detalles Bibliográficos
Autores principales: Herdağdelen, Amaç, Marelli, Marco
Formato: Online Artículo Texto
Lenguaje:English
Publicado: John Wiley and Sons Inc. 2016
Materias:
Acceso en línea:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5484375/
https://www.ncbi.nlm.nih.gov/pubmed/27477913
http://dx.doi.org/10.1111/cogs.12392
Descripción
Sumario:Corpus‐based word frequencies are one of the most important predictors in language processing tasks. Frequencies based on conversational corpora (such as movie subtitles) are shown to better capture the variance in lexical decision tasks compared to traditional corpora. In this study, we show that frequencies computed from social media are currently the best frequency‐based estimators of lexical decision reaction times (up to 3.6% increase in explained variance). The results are robust (observed for Twitter‐ and Facebook‐based frequencies on American English and British English datasets) and are still substantial when we control for corpus size.