
Two-step learning for crowdsourcing data classification

Crowdsourcing learning (Bonald and Combes 2016; Dawid and Skene, J R Stat Soc: Series C (Appl Stat), 28(1):20–28 1979; Karger et al. 2011; Li et al, IEEE Trans Knowl Data Eng, 28(9):2296–2319 2016; Liu et al. 2012; Schlagwein and Bjorn-Andersen, J Assoc Inform Syst, 15(11):3 2014; Zhang et al. 2014)...


Bibliographic Details
Main Authors:	Yu, Hao; Li, Jiaye; Wu, Zhaojiang; Xu, Hang; Zhu, Lei
Format:	Online Article Text
Language:	English
Published:	Springer US 2022
Subjects:	1168: Deep Pattern Discovery for Big Multimedia Data
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9510273/ https://www.ncbi.nlm.nih.gov/pubmed/36188185 http://dx.doi.org/10.1007/s11042-022-12793-4

_version_	1784797409851736064
author	Yu, Hao Li, Jiaye Wu, Zhaojiang Xu, Hang Zhu, Lei
author_facet	Yu, Hao Li, Jiaye Wu, Zhaojiang Xu, Hang Zhu, Lei
author_sort	Yu, Hao
collection	PubMed
description	Crowdsourcing learning (Bonald and Combes 2016; Dawid and Skene, J R Stat Soc: Series C (Appl Stat), 28(1):20–28 1979; Karger et al. 2011; Li et al, IEEE Trans Knowl Data Eng, 28(9):2296–2319 2016; Liu et al. 2012; Schlagwein and Bjorn-Andersen, J Assoc Inform Syst, 15(11):3 2014; Zhang et al. 2014) plays an increasingly important role in the era of big data (Liu et al., IEEE Trans Syst Man Cybern: Syst, 48(12): 451–2461, 2017; Zhang et al. 2014) because it makes large-scale data annotation easy (Musen et al., J Amer Med Informs Assoc, 22(6):1148–1152 2015). However, workers differ in their level of knowledge, so the labels they produce are often inaccurate, which complicates the subsequent processing (Edwards and Teddy 2013) and analysis of crowdsourced data. To address this problem, this paper proposes a two-step learning algorithm for crowdsourced data classification that refines the original labels by jointly considering differences in worker ability and the similarity between crowdsourced data (Kasikci et al. 2013) samples, so as to obtain more accurate labels. First, each worker's ability to label different samples is estimated by constructing and training a worker ability model. Second, the similarity between samples is computed with the cosine measure (Muflikhah and Baharudin 2009). The original labels are then refined by combining these two results. The experimental results show that the proposed two-step learning classification algorithm achieves better results than the comparison algorithm.
format	Online Article Text
id	pubmed-9510273
institution	National Center for Biotechnology Information
language	English
publishDate	2022
publisher	Springer US
record_format	MEDLINE/PubMed
spelling	pubmed-9510273 2022-09-26 Two-step learning for crowdsourcing data classification Yu, Hao Li, Jiaye Wu, Zhaojiang Xu, Hang Zhu, Lei Multimed Tools Appl 1168: Deep Pattern Discovery for Big Multimedia Data
Springer US 2022-07-09 2022 /pmc/articles/PMC9510273/ /pubmed/36188185 http://dx.doi.org/10.1007/s11042-022-12793-4 Text en © The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2022 This article is made available via the PMC Open Access Subset for unrestricted research re-use and secondary analysis in any form or by any means with acknowledgement of the original source. These permissions are granted for the duration of the World Health Organization (WHO) declaration of COVID-19 as a global pandemic.
spellingShingle	1168: Deep Pattern Discovery for Big Multimedia Data Yu, Hao Li, Jiaye Wu, Zhaojiang Xu, Hang Zhu, Lei Two-step learning for crowdsourcing data classification
title	Two-step learning for crowdsourcing data classification
title_full	Two-step learning for crowdsourcing data classification
title_fullStr	Two-step learning for crowdsourcing data classification
title_full_unstemmed	Two-step learning for crowdsourcing data classification
title_short	Two-step learning for crowdsourcing data classification
title_sort	two-step learning for crowdsourcing data classification
topic	1168: Deep Pattern Discovery for Big Multimedia Data
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9510273/ https://www.ncbi.nlm.nih.gov/pubmed/36188185 http://dx.doi.org/10.1007/s11042-022-12793-4
work_keys_str_mv	AT yuhao twosteplearningforcrowdsourcingdataclassification AT lijiaye twosteplearningforcrowdsourcingdataclassification AT wuzhaojiang twosteplearningforcrowdsourcingdataclassification AT xuhang twosteplearningforcrowdsourcingdataclassification AT zhulei twosteplearningforcrowdsourcingdataclassification
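
The description above outlines a pipeline: estimate worker ability, compute cosine similarity between samples, then combine the two to refine the labels. The following is only a minimal sketch of that idea, not the authors' method. The record does not specify the worker ability model or the combination rule, so several parts are assumptions made for illustration: ability is approximated by each worker's agreement with the majority vote, neighbours are weighted by cosine similarity, and the names `estimate_worker_ability`, `refine_labels` and the mixing weight `alpha` are hypothetical.

```python
import math
from collections import Counter


def cosine_similarity(u, v):
    """Cosine of the angle between two feature vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def majority_vote(labels):
    """Most frequent label among one sample's worker labels (dict: worker -> label)."""
    return Counter(labels.values()).most_common(1)[0][0]


def estimate_worker_ability(crowd_labels):
    """ASSUMPTION: ability = fraction of a worker's labels that match the majority vote.
    This stands in for the trained ability model, which the record does not describe."""
    agree, total = Counter(), Counter()
    for labels in crowd_labels:
        consensus = majority_vote(labels)
        for worker, label in labels.items():
            total[worker] += 1
            agree[worker] += int(label == consensus)
    return {worker: agree[worker] / total[worker] for worker in total}


def refine_labels(features, crowd_labels, alpha=0.5):
    """Blend each sample's ability-weighted votes with those of similar samples.
    alpha (hypothetical) is the weight given to the similarity-weighted neighbours."""
    ability = estimate_worker_ability(crowd_labels)

    # Ability-weighted class scores for every sample.
    scores = []
    for labels in crowd_labels:
        sample_scores = Counter()
        for worker, label in labels.items():
            sample_scores[label] += ability[worker]
        scores.append(sample_scores)

    refined = []
    for i, x_i in enumerate(features):
        neighbour_scores, sim_total = Counter(), 0.0
        for j, x_j in enumerate(features):
            if i == j:
                continue
            sim = max(0.0, cosine_similarity(x_i, x_j))  # ignore negative similarity
            sim_total += sim
            for label, value in scores[j].items():
                neighbour_scores[label] += sim * value

        combined = Counter()
        for label, value in scores[i].items():
            combined[label] += (1.0 - alpha) * value
        if sim_total > 0.0:
            for label, value in neighbour_scores.items():
                combined[label] += alpha * value / sim_total
        refined.append(max(combined, key=combined.get))
    return refined


# Toy data (invented for illustration): samples 0-1 and 2-3 form two similar pairs.
features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
crowd_labels = [
    {"a": 0, "b": 0, "c": 1},
    {"a": 0, "b": 1, "c": 1},  # plain majority vote gives 1 here
    {"a": 1, "b": 1, "c": 0},
    {"a": 1, "b": 1, "c": 0},
]
print(refine_labels(features, crowd_labels))  # prints [0, 0, 1, 1]
```

In this toy example the second sample's majority label is overturned because its nearest neighbour carries strong, ability-weighted evidence for the other class. Whether the published algorithm combines the two signals in this way is not stated in the record.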
