
A Cluster-then-label Semi-supervised Learning Approach for Pathology Image Classification

Completely labeled pathology datasets are often challenging and time-consuming to obtain. Semi-supervised learning (SSL) methods are able to learn from fewer labeled data points with the help of a large number of unlabeled data points. In this paper, we investigated the possibility of using clusteri...


Bibliographic Details
Main Authors: Peikari, Mohammad; Salama, Sherine; Nofech-Mozes, Sharon; Martel, Anne L.
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2018
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5940864/
https://www.ncbi.nlm.nih.gov/pubmed/29739993
http://dx.doi.org/10.1038/s41598-018-24876-0
_version_ 1783321173254209536
author Peikari, Mohammad
Salama, Sherine
Nofech-Mozes, Sharon
Martel, Anne L.
author_facet Peikari, Mohammad
Salama, Sherine
Nofech-Mozes, Sharon
Martel, Anne L.
author_sort Peikari, Mohammad
collection PubMed
description Completely labeled pathology datasets are often challenging and time-consuming to obtain. Semi-supervised learning (SSL) methods are able to learn from fewer labeled data points with the help of a large number of unlabeled data points. In this paper, we investigated the possibility of using clustering analysis to identify the underlying structure of the data space for SSL. A cluster-then-label method was proposed to identify high-density regions in the data space which were then used to help a supervised SVM in finding the decision boundary. We have compared our method with other supervised and semi-supervised state-of-the-art techniques using two different classification tasks applied to breast pathology datasets. We found that compared with other state-of-the-art supervised and semi-supervised methods, our SSL method is able to improve classification performance when a limited number of labeled data instances are made available. We also showed that it is important to examine the underlying distribution of the data space before applying SSL techniques to ensure semi-supervised learning assumptions are not violated by the data.
format Online
Article
Text
id pubmed-5940864
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-59408642018-05-14 A Cluster-then-label Semi-supervised Learning Approach for Pathology Image Classification Peikari, Mohammad Salama, Sherine Nofech-Mozes, Sharon Martel, Anne L. Sci Rep Article Completely labeled pathology datasets are often challenging and time-consuming to obtain. Semi-supervised learning (SSL) methods are able to learn from fewer labeled data points with the help of a large number of unlabeled data points. In this paper, we investigated the possibility of using clustering analysis to identify the underlying structure of the data space for SSL. A cluster-then-label method was proposed to identify high-density regions in the data space which were then used to help a supervised SVM in finding the decision boundary. We have compared our method with other supervised and semi-supervised state-of-the-art techniques using two different classification tasks applied to breast pathology datasets. We found that compared with other state-of-the-art supervised and semi-supervised methods, our SSL method is able to improve classification performance when a limited number of labeled data instances are made available. We also showed that it is important to examine the underlying distribution of the data space before applying SSL techniques to ensure semi-supervised learning assumptions are not violated by the data. Nature Publishing Group UK 2018-05-08 /pmc/articles/PMC5940864/ /pubmed/29739993 http://dx.doi.org/10.1038/s41598-018-24876-0 Text en © The Author(s) 2018 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. 
The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
spellingShingle Article
Peikari, Mohammad
Salama, Sherine
Nofech-Mozes, Sharon
Martel, Anne L.
A Cluster-then-label Semi-supervised Learning Approach for Pathology Image Classification
title A Cluster-then-label Semi-supervised Learning Approach for Pathology Image Classification
title_sort cluster-then-label semi-supervised learning approach for pathology image classification
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5940864/
https://www.ncbi.nlm.nih.gov/pubmed/29739993
http://dx.doi.org/10.1038/s41598-018-24876-0