
Optimal clustering under uncertainty

Classical clustering algorithms typically either lack an underlying probability framework to make them predictive or focus on parameter estimation rather than defining and minimizing a notion of error. Recent work addresses these issues by developing a probabilistic framework based on the theory of...

Bibliographic Details
Main authors: Dalton, Lori A., Benalcázar, Marco E., Dougherty, Edward R.
Format: Online Article Text
Language: English
Published: Public Library of Science 2018
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6168142/
https://www.ncbi.nlm.nih.gov/pubmed/30278063
http://dx.doi.org/10.1371/journal.pone.0204627
author Dalton, Lori A.
Benalcázar, Marco E.
Dougherty, Edward R.
collection PubMed
description Classical clustering algorithms typically either lack an underlying probability framework to make them predictive or focus on parameter estimation rather than defining and minimizing a notion of error. Recent work addresses these issues by developing a probabilistic framework based on the theory of random labeled point processes and characterizing a Bayes clusterer that minimizes the number of misclustered points. The Bayes clusterer is analogous to the Bayes classifier. Whereas determining a Bayes classifier requires full knowledge of the feature-label distribution, deriving a Bayes clusterer requires full knowledge of the point process. When uncertain of the point process, one would like to find a robust clusterer that is optimal over the uncertainty, just as one may find optimal robust classifiers with uncertain feature-label distributions. Herein, we derive an optimal robust clusterer by first finding an effective random point process that incorporates all randomness within its own probabilistic structure and from which a Bayes clusterer can be derived that provides an optimal robust clusterer relative to the uncertainty. This is analogous to the use of effective class-conditional distributions in robust classification. After evaluating the performance of robust clusterers in synthetic mixtures of Gaussians models, we apply the framework to granular imaging, where we make use of the asymptotic granulometric moment theory for granular images to relate robust clustering theory to the application.
format Online
Article
Text
id pubmed-6168142
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-6168142 2018-10-19 Optimal clustering under uncertainty Dalton, Lori A. Benalcázar, Marco E. Dougherty, Edward R. PLoS One Research Article
Public Library of Science 2018-10-02 /pmc/articles/PMC6168142/ /pubmed/30278063 http://dx.doi.org/10.1371/journal.pone.0204627 Text en © 2018 Dalton et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title Optimal clustering under uncertainty
topic Research Article