Social Media and Language Processing: How Facebook and Twitter Provide the Best Frequency Estimates for Studying Word Recognition

Corpus‐based word frequencies are one of the most important predictors in language processing tasks. Frequencies based on conversational corpora (such as movie subtitles) are shown to better capture the variance in lexical decision tasks compared to traditional corpora. In this study, we show that f...

Bibliographic Details
Main Authors: Herdağdelen, Amaç; Marelli, Marco
Format: Online Article Text
Language: English
Published: John Wiley and Sons Inc. 2016
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5484375/
https://www.ncbi.nlm.nih.gov/pubmed/27477913
http://dx.doi.org/10.1111/cogs.12392
_version_ 1783245875518111744
author Herdağdelen, Amaç
Marelli, Marco
author_facet Herdağdelen, Amaç
Marelli, Marco
author_sort Herdağdelen, Amaç
collection PubMed
description Corpus‐based word frequencies are one of the most important predictors in language processing tasks. Frequencies based on conversational corpora (such as movie subtitles) are shown to better capture the variance in lexical decision tasks compared to traditional corpora. In this study, we show that frequencies computed from social media are currently the best frequency‐based estimators of lexical decision reaction times (up to 3.6% increase in explained variance). The results are robust (observed for Twitter‐ and Facebook‐based frequencies on American English and British English datasets) and are still substantial when we control for corpus size.
format Online
Article
Text
id pubmed-5484375
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher John Wiley and Sons Inc.
record_format MEDLINE/PubMed
spelling pubmed-54843752017-07-10 Social Media and Language Processing: How Facebook and Twitter Provide the Best Frequency Estimates for Studying Word Recognition Herdağdelen, Amaç Marelli, Marco Cogn Sci Regular Articles Corpus‐based word frequencies are one of the most important predictors in language processing tasks. Frequencies based on conversational corpora (such as movie subtitles) are shown to better capture the variance in lexical decision tasks compared to traditional corpora. In this study, we show that frequencies computed from social media are currently the best frequency‐based estimators of lexical decision reaction times (up to 3.6% increase in explained variance). The results are robust (observed for Twitter‐ and Facebook‐based frequencies on American English and British English datasets) and are still substantial when we control for corpus size. John Wiley and Sons Inc. 2016-08-01 2017-05 /pmc/articles/PMC5484375/ /pubmed/27477913 http://dx.doi.org/10.1111/cogs.12392 Text en © 2016 The Authors. Cognitive Science published by Wiley Periodicals, Inc. on behalf of Cognitive Science Society. This is an open access article under the terms of the Creative Commons Attribution (http://creativecommons.org/licenses/by/4.0/) License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited.
spellingShingle Regular Articles
Herdağdelen, Amaç
Marelli, Marco
Social Media and Language Processing: How Facebook and Twitter Provide the Best Frequency Estimates for Studying Word Recognition
title Social Media and Language Processing: How Facebook and Twitter Provide the Best Frequency Estimates for Studying Word Recognition
title_full Social Media and Language Processing: How Facebook and Twitter Provide the Best Frequency Estimates for Studying Word Recognition
title_fullStr Social Media and Language Processing: How Facebook and Twitter Provide the Best Frequency Estimates for Studying Word Recognition
title_full_unstemmed Social Media and Language Processing: How Facebook and Twitter Provide the Best Frequency Estimates for Studying Word Recognition
title_short Social Media and Language Processing: How Facebook and Twitter Provide the Best Frequency Estimates for Studying Word Recognition
title_sort social media and language processing: how facebook and twitter provide the best frequency estimates for studying word recognition
topic Regular Articles
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5484375/
https://www.ncbi.nlm.nih.gov/pubmed/27477913
http://dx.doi.org/10.1111/cogs.12392
work_keys_str_mv AT herdagdelenamac socialmediaandlanguageprocessinghowfacebookandtwitterprovidethebestfrequencyestimatesforstudyingwordrecognition
AT marellimarco socialmediaandlanguageprocessinghowfacebookandtwitterprovidethebestfrequencyestimatesforstudyingwordrecognition