Improving Crowdsourcing-Based Image Classification Through Expanded Input Elicitation and Machine Learning

This work investigates how different forms of input elicitation obtained from crowdsourcing can be utilized to improve the quality of inferred labels for image classification tasks, where an image must be labeled as either positive or negative depending on the presence/absence of a specified object....

Bibliographic Details
Main authors:	Yasmin, Romena, Hassan, Md Mahmudulla, Grassel, Joshua T., Bhogaraju, Harika, Escobedo, Adolfo R., Fuentes, Olac
Format:	Online Article Text
Language:	English
Published:	Frontiers Media S.A. 2022
Subjects:	Artificial Intelligence
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9276979/ https://www.ncbi.nlm.nih.gov/pubmed/35845435 http://dx.doi.org/10.3389/frai.2022.848056

_version_	1784745839840722944
author	Yasmin, Romena Hassan, Md Mahmudulla Grassel, Joshua T. Bhogaraju, Harika Escobedo, Adolfo R. Fuentes, Olac
author_facet	Yasmin, Romena Hassan, Md Mahmudulla Grassel, Joshua T. Bhogaraju, Harika Escobedo, Adolfo R. Fuentes, Olac
author_sort	Yasmin, Romena
collection	PubMed
description	This work investigates how different forms of input elicitation obtained from crowdsourcing can be utilized to improve the quality of inferred labels for image classification tasks, where an image must be labeled as either positive or negative depending on the presence/absence of a specified object. Five types of input elicitation methods are tested: binary classification (positive or negative); the (x, y)-coordinate of the position participants believe a target object is located; level of confidence in binary response (on a scale from 0 to 100%); what participants believe the majority of the other participants' binary classification is; and participant's perceived difficulty level of the task (on a discrete scale). We design two crowdsourcing studies to test the performance of a variety of input elicitation methods and utilize data from over 300 participants. Various existing voting and machine learning (ML) methods are applied to make the best use of these inputs. In an effort to assess their performance on classification tasks of varying difficulty, a systematic synthetic image generation process is developed. Each generated image combines items from the MPEG-7 Core Experiment CE-Shape-1 Test Set into a single image using multiple parameters (e.g., density, transparency, etc.) and may or may not contain a target object. The difficulty of these images is validated by the performance of an automated image classification method. Experiment results suggest that more accurate results can be achieved with smaller training datasets when both the crowdsourced binary classification labels and the average of the self-reported confidence values in these labels are used as features for the ML classifiers. 
Moreover, when a relatively larger properly annotated dataset is available, in some cases augmenting these ML algorithms with the results (i.e., probability of outcome) from an automated classifier can achieve even higher performance than what can be obtained by using any one of the individual classifiers. Lastly, supplementary analysis of the collected data demonstrates that other performance metrics of interest, namely reduced false-negative rates, can be prioritized through special modifications of the proposed aggregation methods.
format	Online Article Text
id	pubmed-9276979
institution	National Center for Biotechnology Information
language	English
publishDate	2022
publisher	Frontiers Media S.A.
record_format	MEDLINE/PubMed
spelling	pubmed-92769792022-07-14 Improving Crowdsourcing-Based Image Classification Through Expanded Input Elicitation and Machine Learning Yasmin, Romena Hassan, Md Mahmudulla Grassel, Joshua T. Bhogaraju, Harika Escobedo, Adolfo R. Fuentes, Olac Front Artif Intell Artificial Intelligence This work investigates how different forms of input elicitation obtained from crowdsourcing can be utilized to improve the quality of inferred labels for image classification tasks, where an image must be labeled as either positive or negative depending on the presence/absence of a specified object. Five types of input elicitation methods are tested: binary classification (positive or negative); the (x, y)-coordinate of the position participants believe a target object is located; level of confidence in binary response (on a scale from 0 to 100%); what participants believe the majority of the other participants' binary classification is; and participant's perceived difficulty level of the task (on a discrete scale). We design two crowdsourcing studies to test the performance of a variety of input elicitation methods and utilize data from over 300 participants. Various existing voting and machine learning (ML) methods are applied to make the best use of these inputs. In an effort to assess their performance on classification tasks of varying difficulty, a systematic synthetic image generation process is developed. Each generated image combines items from the MPEG-7 Core Experiment CE-Shape-1 Test Set into a single image using multiple parameters (e.g., density, transparency, etc.) and may or may not contain a target object. The difficulty of these images is validated by the performance of an automated image classification method. 
Experiment results suggest that more accurate results can be achieved with smaller training datasets when both the crowdsourced binary classification labels and the average of the self-reported confidence values in these labels are used as features for the ML classifiers. Moreover, when a relatively larger properly annotated dataset is available, in some cases augmenting these ML algorithms with the results (i.e., probability of outcome) from an automated classifier can achieve even higher performance than what can be obtained by using any one of the individual classifiers. Lastly, supplementary analysis of the collected data demonstrates that other performance metrics of interest, namely reduced false-negative rates, can be prioritized through special modifications of the proposed aggregation methods. Frontiers Media S.A. 2022-06-29 /pmc/articles/PMC9276979/ /pubmed/35845435 http://dx.doi.org/10.3389/frai.2022.848056 Text en Copyright © 2022 Yasmin, Hassan, Grassel, Bhogaraju, Escobedo and Fuentes. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
spellingShingle	Artificial Intelligence Yasmin, Romena Hassan, Md Mahmudulla Grassel, Joshua T. Bhogaraju, Harika Escobedo, Adolfo R. Fuentes, Olac Improving Crowdsourcing-Based Image Classification Through Expanded Input Elicitation and Machine Learning
title	Improving Crowdsourcing-Based Image Classification Through Expanded Input Elicitation and Machine Learning
title_full	Improving Crowdsourcing-Based Image Classification Through Expanded Input Elicitation and Machine Learning
title_fullStr	Improving Crowdsourcing-Based Image Classification Through Expanded Input Elicitation and Machine Learning
title_full_unstemmed	Improving Crowdsourcing-Based Image Classification Through Expanded Input Elicitation and Machine Learning
title_short	Improving Crowdsourcing-Based Image Classification Through Expanded Input Elicitation and Machine Learning
title_sort	improving crowdsourcing-based image classification through expanded input elicitation and machine learning
topic	Artificial Intelligence
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9276979/ https://www.ncbi.nlm.nih.gov/pubmed/35845435 http://dx.doi.org/10.3389/frai.2022.848056
work_keys_str_mv	AT yasminromena improvingcrowdsourcingbasedimageclassificationthroughexpandedinputelicitationandmachinelearning AT hassanmdmahmudulla improvingcrowdsourcingbasedimageclassificationthroughexpandedinputelicitationandmachinelearning AT grasseljoshuat improvingcrowdsourcingbasedimageclassificationthroughexpandedinputelicitationandmachinelearning AT bhogarajuharika improvingcrowdsourcingbasedimageclassificationthroughexpandedinputelicitationandmachinelearning AT escobedoadolfor improvingcrowdsourcingbasedimageclassificationthroughexpandedinputelicitationandmachinelearning AT fuentesolac improvingcrowdsourcingbasedimageclassificationthroughexpandedinputelicitationandmachinelearning
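
The description field above mentions using the crowdsourced binary labels together with the average self-reported confidence as features, and adjusting the aggregation to prioritize fewer false negatives. The following is only a minimal sketch of that idea, not the authors' method: the function names, the `(image_id, label, confidence)` tuple layout, and the use of a lowered vote threshold to favor positives are all assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean


def aggregate_responses(responses):
    """Build per-image features from crowd responses.

    `responses` is an iterable of (image_id, label, confidence) tuples
    (hypothetical layout): label is 0 or 1, confidence is on a 0-100 scale.
    Returns {image_id: (share_of_positive_votes, mean_confidence_0_to_1)}.
    """
    by_image = defaultdict(list)
    for image_id, label, confidence in responses:
        by_image[image_id].append((label, confidence))

    features = {}
    for image_id, votes in by_image.items():
        labels = [label for label, _ in votes]
        positive_share = sum(labels) / len(labels)
        mean_confidence = mean(conf for _, conf in votes) / 100
        features[image_id] = (positive_share, mean_confidence)
    return features


def vote(features, threshold=0.5):
    """Label an image positive when its positive share exceeds `threshold`.

    threshold=0.5 is plain majority voting. Lowering it is one assumed way
    to trade more false positives for fewer false negatives.
    """
    return {
        image_id: int(positive_share > threshold)
        for image_id, (positive_share, _) in features.items()
    }
```

The feature pairs returned by `aggregate_responses` could be passed to any standard classifier as a two-column input; `vote` covers only the simple voting baseline.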