
Learning by aggregating experts and filtering novices: a solution to crowdsourcing problems in bioinformatics

BACKGROUND: In many biomedical applications, there is a need for developing classification models based on noisy annotations. Recently, various methods have addressed this scenario by relying on unreliable annotations obtained from multiple sources. RESULTS: We proposed a probabilistic classification al...


Bibliographic Details
Main authors:	Zhang, Ping; Cao, Weidan; Obradovic, Zoran
Format:	Online Article Text
Language:	English
Published:	BioMed Central 2013
Subjects:	Research
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3848820/ https://www.ncbi.nlm.nih.gov/pubmed/24268030 http://dx.doi.org/10.1186/1471-2105-14-S12-S5

collection	PubMed
description	BACKGROUND: In many biomedical applications, there is a need for developing classification models based on noisy annotations. Recently, various methods have addressed this scenario by relying on unreliable annotations obtained from multiple sources. RESULTS: We proposed a probabilistic classification algorithm based on labels obtained from multiple noisy annotators. The new algorithm is capable of eliminating annotations provided by novice labellers and of providing a more accurate estimate of the ground truth through consensus labelling based on the higher-quality annotations. The approach is evaluated on text classification and on prediction of protein disorder. Our study suggests that the new method can achieve higher accuracy, effectiveness and performance than the alternatives. CONCLUSIONS: The proposed method is applicable to meta-learning from multiple existing classification models and from noisy annotations obtained from humans. It is particularly beneficial when many annotations are obtained from novice labellers. In addition, the proposed method can provide further characterization of each annotator, which can help in developing more accurate classifiers by identifying the most competent annotators for each data instance.
id	pubmed-3848820
institution	National Center for Biotechnology Information
record_format	MEDLINE/PubMed
spelling	pubmed-3848820 2013-12-09 Learning by aggregating experts and filtering novices: a solution to crowdsourcing problems in bioinformatics. Zhang, Ping; Cao, Weidan; Obradovic, Zoran. BMC Bioinformatics. Research. BioMed Central 2013-09-24 /pmc/articles/PMC3848820/ /pubmed/24268030 http://dx.doi.org/10.1186/1471-2105-14-S12-S5 Text en Copyright © 2013 Zhang et al.; licensee BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
