
Nonparametric Clustering of Mixed Data Using Modified Chi-Squared Tests


Bibliographic Details
Main Authors: Xu, Yawen; Gao, Xin; Wang, Xiaogang
Format: Online Article Text
Language: English
Published: MDPI 2022
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9778617/
https://www.ncbi.nlm.nih.gov/pubmed/36554154
http://dx.doi.org/10.3390/e24121749
Description
Summary: We propose a non-parametric method for clustering mixed data that contain both continuous and discrete random variables. The product space of the continuous and discrete sample spaces is transformed into a new product space by adaptive quantization of the continuous part. Cluster patterns on the product space are detected locally using a weighted modified chi-squared test. The algorithm requires no user input, since the number of clusters is determined automatically from the data. Simulation studies and real data analysis show that the proposed method outperforms the benchmark method, AutoClass, in various settings.
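
Illustrative sketch (not part of the catalog record or the article): the summary describes quantizing the continuous part of the data and then testing for structure on the resulting discrete product space. The Python below shows only that general idea, under loud assumptions: it uses plain equal-frequency binning in place of the authors' adaptive quantization, and the ordinary Pearson chi-squared statistic in place of their weighted modified test. The function names `quantile_bins` and `pearson_chi2` and the toy data are hypothetical.

```python
from collections import Counter


def quantile_bins(values, k):
    """Assign each value to one of k equal-frequency bins (0..k-1) by rank.

    Assumption: a simple stand-in for adaptive quantization. Ties are
    broken by input order, so equal values may land in different bins.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    bins = [0] * n
    for rank, i in enumerate(order):
        bins[i] = rank * k // n
    return bins


def pearson_chi2(rows, cols):
    """Ordinary Pearson chi-squared statistic for independence of two
    label sequences (not the weighted modified test from the article)."""
    n = len(rows)
    row_tot = Counter(rows)
    col_tot = Counter(cols)
    cell = Counter(zip(rows, cols))
    stat = 0.0
    for r in row_tot:
        for c in col_tot:
            expected = row_tot[r] * col_tot[c] / n
            stat += (cell[(r, c)] - expected) ** 2 / expected
    return stat


# Toy mixed data: one continuous variable x, one discrete variable d.
x = [0.1, 0.2, 0.3, 0.4, 5.1, 5.2, 5.3, 5.4]
d = ["a", "a", "a", "a", "b", "b", "b", "b"]

bins = quantile_bins(x, 2)      # quantize the continuous part
stat = pearson_chi2(bins, d)    # test for structure on the product space
print(bins, stat)
```

A large statistic relative to the chi-squared reference distribution suggests that the quantized cells are not filled uniformly at random, which is the kind of local departure a clustering procedure of this type would look for. How the article weights cells and chooses bin boundaries is not stated in this record.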