Comparing Distributions of Color Words: Pitfalls and Metric Choices

Bibliographic Details
Main authors: Vejdemo-Johansson, Mikael; Vejdemo, Susanne; Ek, Carl-Henrik
Format: Online Article Text
Language: English
Published: Public Library of Science 2014
Subjects: Research Article
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3934892/
https://www.ncbi.nlm.nih.gov/pubmed/24586580
http://dx.doi.org/10.1371/journal.pone.0089184
author Vejdemo-Johansson, Mikael
Vejdemo, Susanne
Ek, Carl-Henrik
collection PubMed
description Computational methods have started to play a significant role in semantic analysis. One particularly accessible area for developing good computational methods for linguistic semantics is color naming, where perceptual dissimilarity measures provide a geometric setting for the analyses. This setting was first studied by Berlin & Kay in 1969, and later by a large data collection effort: the World Color Survey (WCS). The WCS makes available for further research a dataset on color naming by 2,616 speakers of 110 different languages. In analyzing color naming data from the WCS, however, the choice of analysis method strongly affects the results. We demonstrate concrete problems with the choice of metrics made in recent analyses of WCS data, and offer approaches for dealing with the problems we identify. Picking a metric for the space of color naming distributions that ignores perceptual distances between colors assumes a decorrelated system, where strong spatial correlations in fact exist. We demonstrate that the corresponding issues are significantly reduced when using Earth Mover's Distance or Quadratic χ-square Distance, and that these solutions can be approximated with a kernel-based analysis method.
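The point about metrics that ignore perceptual distance can be illustrated with a toy sketch. This is not the authors' code or data: it assumes a hypothetical one-dimensional color axis with 10 equally spaced bins (the WCS stimuli are a two-dimensional chip grid with perceptual distances between chips), and the names `l1_distance`, `emd_1d`, and `point_mass` are made up for illustration. On such a 1-D axis, Earth Mover's Distance between two histograms of equal total mass reduces to the summed absolute difference of their cumulative sums.

```python
from itertools import accumulate


def l1_distance(p, q):
    """Bin-wise L1 distance; it ignores where bins sit on the color axis."""
    return sum(abs(a - b) for a, b in zip(p, q))


def emd_1d(p, q):
    """Earth Mover's Distance on a 1-D axis with unit bin spacing.

    Assumes p and q have the same total mass. In one dimension the
    optimal transport cost equals the sum of absolute differences
    between the two cumulative distributions.
    """
    diffs = (a - b for a, b in zip(p, q))
    return sum(abs(c) for c in accumulate(diffs))


def point_mass(n_bins, index):
    """Histogram with all naming mass on one bin (hypothetical color term)."""
    return [1.0 if i == index else 0.0 for i in range(n_bins)]


# Three hypothetical color terms: a and b are perceptual neighbors,
# c lies far away on the axis.
term_a = point_mass(10, 2)
term_b = point_mass(10, 3)
term_c = point_mass(10, 8)

print(l1_distance(term_a, term_b), l1_distance(term_a, term_c))  # 2.0 2.0
print(emd_1d(term_a, term_b), emd_1d(term_a, term_c))            # 1.0 6.0
```

The bin-wise distance rates the neighboring term and the distant term as equally far from `term_a`, while the transport-based distance grows with how far the naming mass has to move along the axis.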
format Online
Article
Text
id pubmed-3934892
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-3934892 2014-03-04. Comparing Distributions of Color Words: Pitfalls and Metric Choices. Vejdemo-Johansson, Mikael; Vejdemo, Susanne; Ek, Carl-Henrik. PLoS One, Research Article. Public Library of Science, 2014-02-25. /pmc/articles/PMC3934892/ /pubmed/24586580 http://dx.doi.org/10.1371/journal.pone.0089184 Text, en. © 2014 Vejdemo-Johansson et al. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title Comparing Distributions of Color Words: Pitfalls and Metric Choices
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3934892/
https://www.ncbi.nlm.nih.gov/pubmed/24586580
http://dx.doi.org/10.1371/journal.pone.0089184