A Semisupervised Learning Scheme with Self-Paced Learning for Classifying Breast Cancer Histopathological Images

Bibliographic Details
Main Authors: Asare, Sarpong Kwadwo; You, Fei; Nartey, Obed Tettey
Format: Online Article Text
Language: English
Published: Hindawi 2020
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7738795/
https://www.ncbi.nlm.nih.gov/pubmed/33376479
http://dx.doi.org/10.1155/2020/8826568
author Asare, Sarpong Kwadwo
You, Fei
Nartey, Obed Tettey
collection PubMed
description The scarcity of large, well-labeled datasets poses a significant challenge in many medical imaging tasks. Even when sufficient data are available, labeling them accurately is arduous and time-consuming and requires expert skill. Data imbalance compounds these problems and presents a considerable challenge for many machine learning algorithms. In light of this, algorithms that can exploit large amounts of unlabeled data together with a small amount of labeled data, while remaining robust to data imbalance, offer promising prospects for building highly efficient classifiers. This work proposes a semisupervised learning method that integrates self-training and self-paced learning to generate and select pseudolabeled samples for classifying breast cancer histopathological images. A novel pseudolabel generation and selection algorithm is introduced in the learning scheme to generate and select highly confident pseudolabeled samples from both well-represented and less-represented classes. The approach improves performance by jointly learning a model and optimizing the generation of pseudolabels on unlabeled target data, augmenting the training data and retraining the model with the generated labels. A class balancing framework that normalizes the class-wise confidence scores is also proposed to prevent the model from ignoring samples from less-represented classes (hard-to-learn samples), thereby handling data imbalance. Extensive experimental evaluation on the BreakHis dataset demonstrates the effectiveness of the proposed method.
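The description above mentions selecting confident pseudolabels while normalizing confidence scores class by class. The following is a minimal sketch of that general idea, not the authors' algorithm: the function names `select_pseudolabels` and `portion_schedule`, the per-class threshold rule, and every numeric value are assumptions chosen for illustration only.

```python
import numpy as np


def portion_schedule(round_idx, start=0.2, step=0.05, cap=0.5):
    """Hypothetical self-paced schedule: admit a larger share of samples each round."""
    return min(start + step * round_idx, cap)


def select_pseudolabels(probs, portion):
    """Hypothetical class-balanced selection of pseudolabels.

    probs   : (n_samples, n_classes) predicted probabilities on unlabeled data
    portion : fraction of each predicted class to admit in this round
    Returns (labels, selected), where selected is a boolean mask.
    """
    probs = np.asarray(probs, dtype=float)
    n_classes = probs.shape[1]
    preds = probs.argmax(axis=1)
    conf = probs.max(axis=1)

    # One threshold per class: the confidence at the `portion` cut-off
    # among the samples currently predicted as that class.
    thresholds = np.ones(n_classes)
    for c in range(n_classes):
        class_conf = np.sort(conf[preds == c])[::-1]  # descending
        if class_conf.size:
            k = max(1, int(round(portion * class_conf.size)))
            thresholds[c] = class_conf[k - 1]

    # Class-wise normalization: a rare class with lower raw confidence
    # can still pass, because it is compared against its own threshold.
    normalized = probs / thresholds
    labels = normalized.argmax(axis=1)
    selected = normalized.max(axis=1) >= 1.0
    return labels, selected


# Toy example: class 0 is well represented, class 1 is not.
toy_probs = [
    [0.95, 0.05],
    [0.90, 0.10],
    [0.80, 0.20],
    [0.70, 0.30],
    [0.45, 0.55],
    [0.40, 0.60],
]
labels, selected = select_pseudolabels(toy_probs, portion=0.5)
print(labels.tolist())
print(selected.tolist())
```

In this toy input, a single global cut-off of 0.9 would admit only samples predicted as class 0, whereas the per-class thresholds also admit the most confident sample of the minority class. In a self-training loop, the selected samples would be added to the training set, the model retrained, and `portion` raised for the next round.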
format Online
Article
Text
id pubmed-7738795
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Hindawi
record_format MEDLINE/PubMed
spelling pubmed-7738795 2020-12-28 A Semisupervised Learning Scheme with Self-Paced Learning for Classifying Breast Cancer Histopathological Images. Asare, Sarpong Kwadwo; You, Fei; Nartey, Obed Tettey. Comput Intell Neurosci. Research Article.
Hindawi 2020-12-08 /pmc/articles/PMC7738795/ /pubmed/33376479 http://dx.doi.org/10.1155/2020/8826568 Text en Copyright © 2020 Sarpong Kwadwo Asare et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title A Semisupervised Learning Scheme with Self-Paced Learning for Classifying Breast Cancer Histopathological Images
topic Research Article