The Croatian psycholinguistic database: Estimates for 6000 nouns, verbs, adjectives and adverbs
Psycholinguistic databases containing ratings of concreteness, imageability, age of acquisition, and subjective frequency are used in psycholinguistic and neurolinguistic studies which require words as stimuli. Linguistic characteristics (e.g. word length, corpus frequency) are frequently coded, but...
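Main authors: | Peti-Stantić, Anita; Anđel, Maja; Gnjidić, Vedrana; Keresteš, Gordana; Ljubešić, Nikola; Masnikosa, Irina; Tonković, Mirjana; Tušek, Jelena; Willer-Gold, Jana; Stanojević, Mateusz-Milan |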
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Springer US 2021 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8367916/ https://www.ncbi.nlm.nih.gov/pubmed/33904142 http://dx.doi.org/10.3758/s13428-020-01533-x |
_version_ | 1783739113768222720 |
---|---|
author | Peti-Stantić, Anita Anđel, Maja Gnjidić, Vedrana Keresteš, Gordana Ljubešić, Nikola Masnikosa, Irina Tonković, Mirjana Tušek, Jelena Willer-Gold, Jana Stanojević, Mateusz-Milan |
author_facet | Peti-Stantić, Anita Anđel, Maja Gnjidić, Vedrana Keresteš, Gordana Ljubešić, Nikola Masnikosa, Irina Tonković, Mirjana Tušek, Jelena Willer-Gold, Jana Stanojević, Mateusz-Milan |
author_sort | Peti-Stantić, Anita |
collection | PubMed |
description | Psycholinguistic databases containing ratings of concreteness, imageability, age of acquisition, and subjective frequency are used in psycholinguistic and neurolinguistic studies which require words as stimuli. Linguistic characteristics (e.g. word length, corpus frequency) are frequently coded, but word class is seldom systematically treated, although there are indications of its significance for imageability and concreteness. This paper presents the Croatian Psycholinguistic Database (CPD; available at: 10.17234/megahr.2019.hpb), containing 6000 Croatian nouns, verbs, adjectives and adverbs, rated for concreteness, imageability, age of acquisition, and subjective frequency. Moreover, we present computationally obtained extrapolations of concreteness and imageability to the remainder of the Croatian lexicon (available at: https://github.com/megahr/lexicon/blob/master/predictions/hr_c_i.predictions.txt). In the two studies presented here, we explore the significance of word class for concreteness and imageability in human and computationally obtained ratings. The observed correlations in the CPD indicate correspondences between psycholinguistic measures expected from the literature. Word classes exhibit differences in subjective frequency, age of acquisition, concreteness and imageability, with significant differences between nouns, verbs, adjectives and adverbs. In the computational study which focused on concreteness and imageability, concreteness obtained higher correlations with human ratings than imageability, and the system underpredicted the concreteness of nouns, and overpredicted the concreteness of adjectives and adverbs. Overall, this suggests that word class contains schematic conceptual and distributional information. Schematic conceptual content seems to be more significant in human ratings of concreteness and less significant in computationally obtained ratings, where distributional information seems to play a more significant role. This suggests that word class differences should be theoretically explored. |
format | Online Article Text |
id | pubmed-8367916 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | Springer US |
record_format | MEDLINE/PubMed |
spelling | pubmed-8367916 2021-08-31 The Croatian psycholinguistic database: Estimates for 6000 nouns, verbs, adjectives and adverbs Peti-Stantić, Anita Anđel, Maja Gnjidić, Vedrana Keresteš, Gordana Ljubešić, Nikola Masnikosa, Irina Tonković, Mirjana Tušek, Jelena Willer-Gold, Jana Stanojević, Mateusz-Milan Behav Res Methods Article Psycholinguistic databases containing ratings of concreteness, imageability, age of acquisition, and subjective frequency are used in psycholinguistic and neurolinguistic studies which require words as stimuli. Linguistic characteristics (e.g. word length, corpus frequency) are frequently coded, but word class is seldom systematically treated, although there are indications of its significance for imageability and concreteness. This paper presents the Croatian Psycholinguistic Database (CPD; available at: 10.17234/megahr.2019.hpb), containing 6000 Croatian nouns, verbs, adjectives and adverbs, rated for concreteness, imageability, age of acquisition, and subjective frequency. Moreover, we present computationally obtained extrapolations of concreteness and imageability to the remainder of the Croatian lexicon (available at: https://github.com/megahr/lexicon/blob/master/predictions/hr_c_i.predictions.txt). In the two studies presented here, we explore the significance of word class for concreteness and imageability in human and computationally obtained ratings. The observed correlations in the CPD indicate correspondences between psycholinguistic measures expected from the literature. Word classes exhibit differences in subjective frequency, age of acquisition, concreteness and imageability, with significant differences between nouns, verbs, adjectives and adverbs. In the computational study which focused on concreteness and imageability, concreteness obtained higher correlations with human ratings than imageability, and the system underpredicted the concreteness of nouns, and overpredicted the concreteness of adjectives and adverbs. Overall, this suggests that word class contains schematic conceptual and distributional information. Schematic conceptual content seems to be more significant in human ratings of concreteness and less significant in computationally obtained ratings, where distributional information seems to play a more significant role. This suggests that word class differences should be theoretically explored. Springer US 2021-04-26 2021 /pmc/articles/PMC8367916/ /pubmed/33904142 http://dx.doi.org/10.3758/s13428-020-01533-x Text en © The Author(s) 2021 https://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. |
spellingShingle | Article Peti-Stantić, Anita Anđel, Maja Gnjidić, Vedrana Keresteš, Gordana Ljubešić, Nikola Masnikosa, Irina Tonković, Mirjana Tušek, Jelena Willer-Gold, Jana Stanojević, Mateusz-Milan The Croatian psycholinguistic database: Estimates for 6000 nouns, verbs, adjectives and adverbs |
title | The Croatian psycholinguistic database: Estimates for 6000 nouns, verbs, adjectives and adverbs |
title_full | The Croatian psycholinguistic database: Estimates for 6000 nouns, verbs, adjectives and adverbs |
title_fullStr | The Croatian psycholinguistic database: Estimates for 6000 nouns, verbs, adjectives and adverbs |
title_full_unstemmed | The Croatian psycholinguistic database: Estimates for 6000 nouns, verbs, adjectives and adverbs |
title_short | The Croatian psycholinguistic database: Estimates for 6000 nouns, verbs, adjectives and adverbs |
title_sort | croatian psycholinguistic database: estimates for 6000 nouns, verbs, adjectives and adverbs |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8367916/ https://www.ncbi.nlm.nih.gov/pubmed/33904142 http://dx.doi.org/10.3758/s13428-020-01533-x |