
The Croatian psycholinguistic database: Estimates for 6000 nouns, verbs, adjectives and adverbs

Psycholinguistic databases containing ratings of concreteness, imageability, age of acquisition, and subjective frequency are used in psycholinguistic and neurolinguistic studies which require words as stimuli. Linguistic characteristics (e.g. word length, corpus frequency) are frequently coded, but...


Bibliographic Details
Main authors: Peti-Stantić, Anita; Anđel, Maja; Gnjidić, Vedrana; Keresteš, Gordana; Ljubešić, Nikola; Masnikosa, Irina; Tonković, Mirjana; Tušek, Jelena; Willer-Gold, Jana; Stanojević, Mateusz-Milan
Format: Online Article Text
Language: English
Published: Springer US 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8367916/
https://www.ncbi.nlm.nih.gov/pubmed/33904142
http://dx.doi.org/10.3758/s13428-020-01533-x
author Peti-Stantić, Anita
Anđel, Maja
Gnjidić, Vedrana
Keresteš, Gordana
Ljubešić, Nikola
Masnikosa, Irina
Tonković, Mirjana
Tušek, Jelena
Willer-Gold, Jana
Stanojević, Mateusz-Milan
author_sort Peti-Stantić, Anita
collection PubMed
description Psycholinguistic databases containing ratings of concreteness, imageability, age of acquisition, and subjective frequency are used in psycholinguistic and neurolinguistic studies which require words as stimuli. Linguistic characteristics (e.g. word length, corpus frequency) are frequently coded, but word class is seldom systematically treated, although there are indications of its significance for imageability and concreteness. This paper presents the Croatian Psycholinguistic Database (CPD; available at: 10.17234/megahr.2019.hpb), containing 6000 Croatian nouns, verbs, adjectives and adverbs, rated for concreteness, imageability, age of acquisition, and subjective frequency. Moreover, we present computationally obtained extrapolations of concreteness and imageability to the remainder of the Croatian lexicon (available at: https://github.com/megahr/lexicon/blob/master/predictions/hr_c_i.predictions.txt). In the two studies presented here, we explore the significance of word class for concreteness and imageability in human and computationally obtained ratings. The observed correlations in the CPD indicate correspondences between psycholinguistic measures expected from the literature. Word classes exhibit differences in subjective frequency, age of acquisition, concreteness and imageability, with significant differences between nouns, verbs, adjectives and adverbs. In the computational study which focused on concreteness and imageability, concreteness obtained higher correlations with human ratings than imageability, and the system underpredicted the concreteness of nouns, and overpredicted the concreteness of adjectives and adverbs. Overall, this suggests that word class contains schematic conceptual and distributional information. Schematic conceptual content seems to be more significant in human ratings of concreteness and less significant in computationally obtained ratings, where distributional information seems to play a more significant role. 
This suggests that word class differences should be theoretically explored.
format Online
Article
Text
id pubmed-8367916
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Springer US
record_format MEDLINE/PubMed
spelling pubmed-8367916 2021-08-31 The Croatian psycholinguistic database: Estimates for 6000 nouns, verbs, adjectives and adverbs Peti-Stantić, Anita Anđel, Maja Gnjidić, Vedrana Keresteš, Gordana Ljubešić, Nikola Masnikosa, Irina Tonković, Mirjana Tušek, Jelena Willer-Gold, Jana Stanojević, Mateusz-Milan Behav Res Methods Article Springer US 2021-04-26 2021 /pmc/articles/PMC8367916/ /pubmed/33904142 http://dx.doi.org/10.3758/s13428-020-01533-x Text en © The Author(s) 2021 https://creativecommons.org/licenses/by/4.0/
Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
title The Croatian psycholinguistic database: Estimates for 6000 nouns, verbs, adjectives and adverbs
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8367916/
https://www.ncbi.nlm.nih.gov/pubmed/33904142
http://dx.doi.org/10.3758/s13428-020-01533-x