Active learning with label quality control
Training deep neural networks requires a large number of labeled samples, which are typically provided by crowdsourced workers or professionals at a high cost. To obtain qualified labels, samples need to be relabeled for inspection to control the quality of the labels, which further increases the co...
Main Authors: | Wang, Xingyu; Chi, Xurong; Song, Yanzhi; Yang, Zhouwang |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | PeerJ Inc. 2023 |
Subjects: | Artificial Intelligence |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10496030/ https://www.ncbi.nlm.nih.gov/pubmed/37705638 http://dx.doi.org/10.7717/peerj-cs.1480 |
_version_ | 1785105021167206400 |
---|---|
author | Wang, Xingyu; Chi, Xurong; Song, Yanzhi; Yang, Zhouwang |
author_sort | Wang, Xingyu |
collection | PubMed |
description | Training deep neural networks requires a large number of labeled samples, which are typically provided by crowdsourced workers or professionals at a high cost. To obtain qualified labels, samples need to be relabeled for inspection to control the quality of the labels, which further increases the cost. Active learning methods aim to select the most valuable samples for labeling to reduce labeling costs. We designed a practical active learning method that adaptively allocates labeling resources to the most valuable unlabeled samples and the most likely mislabeled labeled samples, thus significantly reducing the overall labeling cost. We prove that the probability of our proposed method labeling more than one sample from any redundant sample set in the same batch is less than 1/k, where k is the number of folds in the k-fold experiment used in the method, thus significantly reducing the labeling resources wasted on redundant samples. Our proposed method achieves the best level of results on benchmark datasets, and it performs well in an industrial application of automatic optical inspection. |
format | Online Article Text |
id | pubmed-10496030 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | PeerJ Inc. |
record_format | MEDLINE/PubMed |
spelling | pubmed-10496030 2023-09-13 Active learning with label quality control Wang, Xingyu; Chi, Xurong; Song, Yanzhi; Yang, Zhouwang PeerJ Comput Sci Artificial Intelligence Training deep neural networks requires a large number of labeled samples, which are typically provided by crowdsourced workers or professionals at a high cost. To obtain qualified labels, samples need to be relabeled for inspection to control the quality of the labels, which further increases the cost. Active learning methods aim to select the most valuable samples for labeling to reduce labeling costs. We designed a practical active learning method that adaptively allocates labeling resources to the most valuable unlabeled samples and the most likely mislabeled labeled samples, thus significantly reducing the overall labeling cost. We prove that the probability of our proposed method labeling more than one sample from any redundant sample set in the same batch is less than 1/k, where k is the number of the k-fold experiment used in the method, thus significantly reducing the labeling resources wasted on redundant samples. Our proposed method achieves the best level of results on benchmark datasets, and it performs well in an industrial application of automatic optical inspection. PeerJ Inc. 2023-09-08 /pmc/articles/PMC10496030/ /pubmed/37705638 http://dx.doi.org/10.7717/peerj-cs.1480 Text en © 2023 Wang et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ Computer Science) and either DOI or URL of the article must be cited. |
title | Active learning with label quality control |
topic | Artificial Intelligence |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10496030/ https://www.ncbi.nlm.nih.gov/pubmed/37705638 http://dx.doi.org/10.7717/peerj-cs.1480 |
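
The description above mentions spending one labeling budget on both the most valuable unlabeled samples and the most likely mislabeled labeled samples. The following is a minimal sketch of that general idea only, not the authors' algorithm: it assumes each sample already has predicted class probabilities from a model that did not train on it, scores unlabeled samples by least confidence and labeled samples by one minus the probability of their current label, and ranks both pools together. The function names, the scoring rules, and the sample ids are all hypothetical, and the sketch does not reproduce the paper's k-fold procedure or its 1/k redundancy bound.

```python
from typing import Dict, List, Sequence, Tuple


def uncertainty(probs: Sequence[float]) -> float:
    """Least-confidence score for an unlabeled sample: 1 - max class probability."""
    return 1.0 - max(probs)


def mislabel_score(probs: Sequence[float], given_label: int) -> float:
    """Suspicion score for a labeled sample: 1 - probability of its current label."""
    return 1.0 - probs[given_label]


def select_batch(
    unlabeled_probs: Dict[str, Sequence[float]],
    labeled_probs: Dict[str, Tuple[Sequence[float], int]],
    budget: int,
) -> List[Tuple[str, str, float]]:
    """Rank both pools on one scale and spend the budget on the top scores.

    Returns (sample_id, action, score) tuples; action is "label" or "relabel".
    Both scoring rules are illustrative assumptions, not taken from the paper.
    """
    candidates = [
        (sid, "label", uncertainty(p)) for sid, p in unlabeled_probs.items()
    ]
    candidates += [
        (sid, "relabel", mislabel_score(p, y))
        for sid, (p, y) in labeled_probs.items()
    ]
    # Highest score first; ties broken by sample id for deterministic output.
    candidates.sort(key=lambda c: (-c[2], c[0]))
    return candidates[:budget]


if __name__ == "__main__":
    # Hypothetical toy data: two unlabeled and two labeled samples, two classes.
    unlabeled = {"u1": [0.5, 0.5], "u2": [0.9, 0.1]}
    labeled = {"l1": ([0.2, 0.8], 0), "l2": ([0.1, 0.9], 1)}
    for sid, action, score in select_batch(unlabeled, labeled, budget=2):
        print(sid, action, round(score, 2))
```

Ranking both pools on a single score is only one simple way to let the split between new labels and relabels adapt to the data; the paper's own allocation rule may differ.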