Nonparametric Clustering of Mixed Data Using Modified Chi-Squared Tests

Bibliographic Details
Main Authors: Xu, Yawen; Gao, Xin; Wang, Xiaogang
Format: Online Article Text
Language: English
Published: MDPI 2022
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9778617/
https://www.ncbi.nlm.nih.gov/pubmed/36554154
http://dx.doi.org/10.3390/e24121749
Collection: PubMed
Description: We propose a non-parametric method for clustering mixed data that contain both continuous and discrete random variables. The product space of the continuous and discrete sample spaces is transformed into a new product space by adaptive quantization of the continuous part. Cluster patterns on the product space are detected locally using a weighted modified chi-squared test. The algorithm requires no user input, since the number of clusters is determined automatically from the data. Simulation studies and real-data analyses show that the proposed method outperforms the benchmark method, AutoClass, in various settings.
ID: pubmed-9778617
Institution: National Center for Biotechnology Information
Record format: MEDLINE/PubMed
Record: pubmed-9778617, 2022-12-23
Journal: Entropy (Basel)
Publication date: 2022-11-29
License: © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).