Deep Multimodal Clustering with Cross Reconstruction

Bibliographic Details
Main Authors: Zhang, Xianchao; Tang, Xiaorui; Zong, Linlin; Liu, Xinyue; Mu, Jie
Format: Online Article Text
Language: English
Published: 2020
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7206244/
http://dx.doi.org/10.1007/978-3-030-47426-3_24
Description
Summary: Recently, there has been a surge of interest in multimodal clustering, and extracting common features plays a critical role in these methods. However, because they overlook the fact that data in different modalities share similar distributions in feature space, most works do not fully mine the inter-modal distribution relationships, which eventually leads to unacceptable common features. To address this issue, we propose the deep multimodal clustering with cross reconstruction method, which first extracts multimodal features in an unsupervised way and then clusters the extracted features. The proposed cross reconstruction aims to build latent connections among different modalities, which effectively reduces the distribution differences in feature space. Theoretical analysis shows that cross reconstruction reduces the Wasserstein distance between the multimodal feature distributions. Experimental results on six benchmark datasets demonstrate that our method achieves an obvious improvement over several state-of-the-art methods.