
Establishing semantic relatedness through ratings, reaction times, and semantic vectors: A database in Polish

This study presents a Polish semantic priming dataset and semantic similarity ratings for word pairs obtained with native Polish speakers, as well as a range of semantic spaces. The word pairs include strongly related, weakly related, and semantically unrelated word pairs. The rating study (Experime...

Bibliographic Details
Main authors: Rataj, Karolina; Kakuba, Patrycja; Mandera, Paweł; van Heuven, Walter J. B.
Format: Online Article Text
Language: English
Published: Public Library of Science 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10124854/
https://www.ncbi.nlm.nih.gov/pubmed/37093824
http://dx.doi.org/10.1371/journal.pone.0284801
author Rataj, Karolina
Kakuba, Patrycja
Mandera, Paweł
van Heuven, Walter J. B.
collection PubMed
description This study presents a Polish semantic priming dataset and semantic similarity ratings for word pairs obtained from native Polish speakers, as well as a range of semantic spaces. The word pairs include strongly related, weakly related, and semantically unrelated pairs. The rating study (Experiment 1) confirmed that the three conditions differed in semantic relatedness. The semantic priming lexical decision study, conducted with a carefully matched subset of the stimuli (Experiment 2), revealed strong semantic priming effects for strongly related word pairs, whereas weakly related word pairs showed a smaller but still significant priming effect relative to semantically unrelated word pairs. The datasets from both experiments and the SimLex-999 dataset for Polish were then used in a robust semantic model selection from existing and newly trained semantic spaces. This database of semantic vectors, semantic relatedness ratings, and behavioral data collected for all word pairs enables future researchers to benchmark new vectors against this dataset. Furthermore, the new vectors are made freely available to researchers. Although similar sets of semantically strongly and weakly related word pairs are available in other languages, this is the first freely available database for Polish that combines measures of semantic distance and human data.
format Online
Article
Text
id pubmed-10124854
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-10124854 2023-04-25 PLoS One Research Article
Public Library of Science 2023-04-24 /pmc/articles/PMC10124854/ /pubmed/37093824 http://dx.doi.org/10.1371/journal.pone.0284801 Text en © 2023 Rataj et al https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title Establishing semantic relatedness through ratings, reaction times, and semantic vectors: A database in Polish
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10124854/
https://www.ncbi.nlm.nih.gov/pubmed/37093824
http://dx.doi.org/10.1371/journal.pone.0284801