Multiple co-clustering based on nonparametric mixture models with heterogeneous marginal distributions

We propose a novel method for multiple clustering, which is useful for analysis of high-dimensional data containing heterogeneous types of features. Our method is based on nonparametric Bayesian mixture models in which features are automatically partitioned (into views) for each clustering solution....

Bibliographic Details
Main authors: Tokuda, Tomoki, Yoshimoto, Junichiro, Shimizu, Yu, Okada, Go, Takamura, Masahiro, Okamoto, Yasumasa, Yamawaki, Shigeto, Doya, Kenji
Format: Online Article Text
Language: English
Published: Public Library of Science 2017
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5648298/
https://www.ncbi.nlm.nih.gov/pubmed/29049392
http://dx.doi.org/10.1371/journal.pone.0186566
author Tokuda, Tomoki
Yoshimoto, Junichiro
Shimizu, Yu
Okada, Go
Takamura, Masahiro
Okamoto, Yasumasa
Yamawaki, Shigeto
Doya, Kenji
author_sort Tokuda, Tomoki
collection PubMed
description We propose a novel method for multiple clustering, which is useful for analysis of high-dimensional data containing heterogeneous types of features. Our method is based on nonparametric Bayesian mixture models in which features are automatically partitioned (into views) for each clustering solution. This feature partition works as feature selection for a particular clustering solution, which screens out irrelevant features. To make our method applicable to high-dimensional data, a co-clustering structure is newly introduced for each view. Further, the outstanding novelty of our method is that we simultaneously model different distribution families, such as Gaussian, Poisson, and multinomial distributions in each cluster block, which widens areas of application to real data. We apply the proposed method to synthetic and real data, and show that our method outperforms other multiple clustering methods both in recovering true cluster structures and in computation time. Finally, we apply our method to a depression dataset with no true cluster structure available, from which useful inferences are drawn about possible clustering structures of the data.
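The structure described above (features partitioned into views, a co-clustering of objects and features inside each view, and a distribution family per cluster block) can be illustrated with a small scoring sketch. This is not the authors' inference procedure: it uses plug-in maximum-likelihood estimates per block instead of a nonparametric Bayesian model, a categorical likelihood stands in for the multinomial case, and every name in it (`total_loglik`, `view_of`, `feat_cluster_of`, `obj_cluster_of`) as well as the toy data is hypothetical. Keeping features of different families in separate blocks is likewise an assumption made for the illustration.

```python
import math
from collections import defaultdict


def gaussian_loglik(xs):
    """Plug-in Gaussian log-likelihood using the block's own mean and variance."""
    n = len(xs)
    mu = sum(xs) / n
    sq = sum((x - mu) ** 2 for x in xs)
    var = max(sq / n, 1e-6)  # floor avoids log(0) for constant blocks
    return -0.5 * n * math.log(2 * math.pi * var) - sq / (2 * var)


def poisson_loglik(xs):
    """Plug-in Poisson log-likelihood using the block's mean as the rate."""
    lam = max(sum(xs) / len(xs), 1e-6)
    return sum(x * math.log(lam) - lam - math.lgamma(x + 1) for x in xs)


def categorical_loglik(xs):
    """Plug-in categorical log-likelihood using observed category frequencies."""
    counts = defaultdict(int)
    for x in xs:
        counts[x] += 1
    n = len(xs)
    return sum(c * math.log(c / n) for c in counts.values())


FAMILIES = {
    "gaussian": gaussian_loglik,
    "poisson": poisson_loglik,
    "categorical": categorical_loglik,
}


def total_loglik(data, family_of, view_of, feat_cluster_of, obj_cluster_of):
    """Sum block log-likelihoods for one candidate multiple co-clustering.

    data:            feature name -> list of values, one per object
    family_of:       feature name -> key of FAMILIES
    view_of:         feature name -> view id
    feat_cluster_of: feature name -> feature-cluster id within its view
    obj_cluster_of:  view id -> list of object-cluster labels, one per object
    """
    blocks = defaultdict(list)
    for feat, values in data.items():
        view = view_of[feat]
        for i, x in enumerate(values):
            # Assumption: a block never mixes distribution families.
            key = (view, family_of[feat], feat_cluster_of[feat], obj_cluster_of[view][i])
            blocks[key].append(x)
    return sum(FAMILIES[key[1]](xs) for key, xs in blocks.items())


# Hypothetical toy data: 6 objects, 4 features of three different types.
data = {
    "g1": [0.1, -0.2, 0.0, 5.1, 4.9, 5.0],
    "g2": [0.2, 0.0, -0.1, 4.8, 5.2, 5.0],
    "p1": [1, 9, 0, 10, 1, 11],
    "c1": ["a", "b", "a", "b", "a", "b"],
}
family_of = {"g1": "gaussian", "g2": "gaussian", "p1": "poisson", "c1": "categorical"}
feat_cluster_of = {"g1": 0, "g2": 0, "p1": 0, "c1": 0}

# Two views, each with its own clustering of the same six objects.
good = total_loglik(
    data, family_of,
    view_of={"g1": 0, "g2": 0, "p1": 1, "c1": 1},
    feat_cluster_of=feat_cluster_of,
    obj_cluster_of={0: [0, 0, 0, 1, 1, 1], 1: [0, 1, 0, 1, 0, 1]},
)

# A single view forces one object clustering onto all features.
bad = total_loglik(
    data, family_of,
    view_of={"g1": 0, "g2": 0, "p1": 0, "c1": 0},
    feat_cluster_of=feat_cluster_of,
    obj_cluster_of={0: [0, 0, 0, 1, 1, 1]},
)

print(good > bad)
```

In this toy case the two-view assignment scores higher because the count and categorical features follow a different grouping of objects than the continuous ones. A plug-in likelihood always favours finer partitions, so it cannot choose the number of views or clusters by itself; that is one reason a Bayesian treatment such as the one the abstract describes is used in practice.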
format Online
Article
Text
id pubmed-5648298
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-5648298 2017-11-03 PLoS One Research Article Public Library of Science 2017-10-19 /pmc/articles/PMC5648298/ /pubmed/29049392 http://dx.doi.org/10.1371/journal.pone.0186566 Text en © 2017 Tokuda et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title Multiple co-clustering based on nonparametric mixture models with heterogeneous marginal distributions
topic Research Article