Reliability of crowdsourcing as a method for collecting emotions labels on pictures

OBJECTIVE: In this paper we study whether, and under what conditions, crowdsourcing can be used as a reliable method for collecting high-quality emotion labels on pictures. To this end, we run a set of crowdsourcing experiments on the widely used IAPS dataset, using the Self-Assessment Manikin (SAM) emotion collection instrument to rate pictures on valence, arousal and dominance, and explore the consistency of crowdsourced results across multiple runs (reliability) and their level of agreement with the gold labels (quality). In doing so, we explored the impact of targeting populations of different levels of reputation (and cost) and of collecting varying numbers of ratings per picture. RESULTS: The results indicate that crowdsourcing can be a reliable method, reaching excellent levels of reliability and agreement with only 3 ratings per picture for valence and 8 for arousal, with only marginal differences between target populations. Results for dominance were very poor, echoing previous studies on the data collection instrument used. We also observed that specific types of content generate diverging opinions in participants (leading to higher variability or multimodal distributions), which remain consistent across pictures of the same theme. These findings can inform the collection and exploitation of crowdsourced emotion datasets.
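As a rough illustration of the kind of analysis the abstract describes (agreement with gold labels as a function of how many crowd ratings are aggregated per picture), the sketch below uses synthetic data and Pearson correlation as a stand-in agreement statistic. It is not the authors' code, and the paper may well use a different measure (e.g., intraclass correlation); the rating matrix, noise level, and 9-point SAM scale here are assumptions for illustration only.

```python
# Hedged sketch (not the authors' analysis): how agreement with gold labels
# changes as more crowd ratings per picture are averaged together.
# Ratings are simulated on a 9-point SAM-style scale.
import numpy as np

rng = np.random.default_rng(0)

n_pictures, max_ratings = 100, 12
gold = rng.uniform(1, 9, size=n_pictures)             # hypothetical gold valence scores
noise = rng.normal(0, 1.5, size=(n_pictures, max_ratings))
ratings = np.clip(gold[:, None] + noise, 1, 9)        # synthetic crowd ratings

def agreement_by_k(ratings: np.ndarray, gold: np.ndarray) -> list[float]:
    """Pearson correlation between the gold labels and the mean of the
    first k ratings per picture, for k = 1..max_ratings."""
    out = []
    for k in range(1, ratings.shape[1] + 1):
        crowd_mean = ratings[:, :k].mean(axis=1)
        out.append(float(np.corrcoef(crowd_mean, gold)[0, 1]))
    return out

for k, r in enumerate(agreement_by_k(ratings, gold), start=1):
    print(f"{k:2d} ratings/picture -> r = {r:.3f}")
```

With well-behaved rating noise of this kind, agreement climbs steeply over the first few aggregated ratings and then flattens, which is consistent in spirit with the paper's finding that a small number of ratings per picture (3 for valence, 8 for arousal) already yields high agreement.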


Bibliographic Details
Main Authors: Korovina, Olga; Baez, Marcos; Casati, Fabio
Format: Online Article (Text)
Language: English
Published: BioMed Central, 2019
Subjects: Research Note
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6822440/
https://www.ncbi.nlm.nih.gov/pubmed/31666124
http://dx.doi.org/10.1186/s13104-019-4764-4
Record Details
Collection: PubMed (record ID: pubmed-6822440)
Institution: National Center for Biotechnology Information
Record Format: MEDLINE/PubMed
Journal: BMC Res Notes (Research Note)
Published online: 2019-10-30
License: © The Author(s) 2019. Open Access: this article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.