FctClus: A Fast Clustering Algorithm for Heterogeneous Information Networks

It is important to cluster heterogeneous information networks. A fast clustering algorithm based on an approximate commute time embedding for heterogeneous information networks with a star network schema is proposed in this paper by utilizing the sparsity of heterogeneous information networks. First...

Bibliographic Details
Main Authors: Yang, Jing; Chen, Limin; Zhang, Jianpei
Format: Online Article Text
Language: English
Published: Public Library of Science 2015
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4474961/
https://www.ncbi.nlm.nih.gov/pubmed/26090857
http://dx.doi.org/10.1371/journal.pone.0130086
author Yang, Jing
Chen, Limin
Zhang, Jianpei
collection PubMed
description It is important to cluster heterogeneous information networks. A fast clustering algorithm based on an approximate commute time embedding for heterogeneous information networks with a star network schema is proposed in this paper by utilizing the sparsity of heterogeneous information networks. First, a heterogeneous information network is transformed into multiple compatible bipartite graphs from the compatible point of view. Second, the approximate commute time embedding of each bipartite graph is computed using random mapping and a linear time solver. All of the indicator subsets in each embedding simultaneously determine the target dataset. Finally, a general model is formulated by these indicator subsets, and a fast algorithm is derived by simultaneously clustering all of the indicator subsets using the sum of the weighted distances for all indicators for an identical target object. The proposed fast algorithm, FctClus, is shown to be efficient and generalizable and exhibits high clustering accuracy and fast computation speed based on a theoretic analysis and experimental verification.
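The embedding step described above (random mapping followed by a linear solve) can be illustrated for a single bipartite graph. This is only a minimal sketch under stated assumptions, not the FctClus implementation: the function name `commute_time_embedding` and its parameters are hypothetical, the graph is assumed connected, and a dense pseudoinverse from NumPy stands in for the linear time solver mentioned in the abstract. It uses the standard identity that the commute time between nodes i and j equals the graph volume times (e_i - e_j)^T L^+ (e_i - e_j), and approximates it with a random ±1 projection.

```python
import numpy as np

def commute_time_embedding(W, k=64, seed=0):
    """Approximate commute time embedding of a bipartite graph.

    W is an (n1 x n2) nonnegative biadjacency matrix (hypothetical input
    layout). Returns an (n1 + n2) x k array whose squared row distances
    approximate commute times. Assumes the graph is connected.
    """
    rng = np.random.default_rng(seed)
    n1, n2 = W.shape
    rows, cols = np.nonzero(W)
    w = W[rows, cols]                      # edge weights, one per edge
    m, n = len(w), n1 + n2

    # Signed edge-node incidence matrix B (m x n): +1 on the left-side
    # endpoint, -1 on the right-side endpoint of each edge.
    B = np.zeros((m, n))
    B[np.arange(m), rows] = 1.0
    B[np.arange(m), n1 + cols] = -1.0

    L = B.T @ (w[:, None] * B)             # graph Laplacian, L = B^T W B
    vol = 2.0 * w.sum()                    # sum of weighted degrees

    # Random mapping: k x m matrix with entries +-1/sqrt(k).
    Q = rng.choice([-1.0, 1.0], size=(k, m)) / np.sqrt(k)
    Y = Q @ (np.sqrt(w)[:, None] * B)      # k x n projected incidence

    # ASSUMPTION: dense pseudoinverse replaces the linear time solver.
    Z = np.sqrt(vol) * Y @ np.linalg.pinv(L)
    return Z.T                             # one k-dimensional row per node
```

Squared Euclidean distances between rows of the result estimate commute times, so any distance-based clustering can be run on the rows; the accuracy of the estimate improves as `k` grows.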
format Online
Article
Text
id pubmed-4474961
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-4474961 2015-06-30 FctClus: A Fast Clustering Algorithm for Heterogeneous Information Networks Yang, Jing; Chen, Limin; Zhang, Jianpei PLoS One Research Article Public Library of Science 2015-06-19 /pmc/articles/PMC4474961/ /pubmed/26090857 http://dx.doi.org/10.1371/journal.pone.0130086 Text en © 2015 Yang et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title FctClus: A Fast Clustering Algorithm for Heterogeneous Information Networks
topic Research Article