How the Outliers Influence the Quality of Clustering?
Main authors: | , |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | MDPI 2022 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9324173/ https://www.ncbi.nlm.nih.gov/pubmed/35885141 http://dx.doi.org/10.3390/e24070917 |
Summary: | In this article, we evaluate the efficiency and performance of two clustering algorithms: AHC (Agglomerative Hierarchical Clustering) and [Formula: see text]. We are aware that various linkage options and distance measures influence the clustering results. We assess the quality of clustering using the Davies–Bouldin and Dunn cluster validity indexes. The main contribution of this research is to verify whether the quality of clusters is higher when outliers are removed from the data than when they are left in. To do this, we compare and analyze outlier detection algorithms depending on the applied clustering algorithm. In our research, we use and compare the LOF (Local Outlier Factor) and COF (Connectivity-based Outlier Factor) algorithms for detecting outliers before and after removing [Formula: see text], [Formula: see text], and [Formula: see text] of outliers. We then analyze how much the quality of clustering improves. The experiments used three real data sets with different numbers of instances. |
---|
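
The two validity indexes named in the summary can be illustrated with a minimal sketch. It is not the authors' implementation: the toy points, the labels, and the helper names (`group_by_label`, `centroid`, `davies_bouldin`, `dunn`) are hypothetical, the single outlier is dropped by hand rather than detected with LOF or COF, and the Dunn index is computed in only one of its several common variants (smallest distance between points of different clusters divided by the largest within-cluster diameter). A lower Davies–Bouldin value and a higher Dunn value both indicate better-separated clusters.

```python
from itertools import combinations
from math import dist
from statistics import mean


def group_by_label(points, labels):
    """Collect points into one list per cluster label."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    return list(clusters.values())


def centroid(cluster):
    """Coordinate-wise mean of the points in a cluster."""
    return tuple(mean(coords) for coords in zip(*cluster))


def davies_bouldin(points, labels):
    """Mean, over clusters, of the worst (scatter_i + scatter_j) / centroid distance."""
    clusters = group_by_label(points, labels)
    centers = [centroid(c) for c in clusters]
    # Scatter: average distance from each point to its cluster centroid.
    scatter = [mean(dist(p, ctr) for p in c) for c, ctr in zip(clusters, centers)]
    worst = [
        max(
            (scatter[i] + scatter[j]) / dist(centers[i], centers[j])
            for j in range(len(clusters)) if j != i
        )
        for i in range(len(clusters))
    ]
    return mean(worst)


def dunn(points, labels):
    """Smallest between-cluster point distance over largest within-cluster diameter.

    Assumes at least two clusters and at least one cluster with two or more points.
    """
    clusters = group_by_label(points, labels)
    min_separation = min(
        dist(p, q)
        for a, b in combinations(clusters, 2)
        for p in a for q in b
    )
    max_diameter = max(
        dist(p, q) for c in clusters for p, q in combinations(c, 2)
    )
    return min_separation / max_diameter


# Hypothetical toy data: two compact squares plus one stray point assigned to cluster 0.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (5, 5), (5, 6), (6, 5), (6, 6), (3, 3)]
labels = [0, 0, 0, 0, 1, 1, 1, 1, 0]

# "Removing the outlier" here simply means dropping the last point by hand.
clean_points, clean_labels = points[:-1], labels[:-1]

db_with, dunn_with = davies_bouldin(points, labels), dunn(points, labels)
db_clean, dunn_clean = davies_bouldin(clean_points, clean_labels), dunn(clean_points, clean_labels)
```

On this toy data, dropping the stray point lowers the Davies–Bouldin value and raises the Dunn value, which is the direction of change the summary sets out to test on real data sets.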