PFClust: an optimised implementation of a parameter-free clustering algorithm

Bibliographic Details
Main authors: Musayeva, Khadija; Henderson, Tristan; Mitchell, John BO; Mavridis, Lazaros
Format: Online Article Text
Language: English
Published: BioMed Central 2014
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3940029/
https://www.ncbi.nlm.nih.gov/pubmed/24490618
http://dx.doi.org/10.1186/1751-0473-9-5
Description
Summary: BACKGROUND: A well-known problem in cluster analysis is finding an optimal number of clusters reflecting the inherent structure of the data. PFClust is a partitioning-based clustering algorithm capable, unlike many widely-used clustering algorithms, of automatically proposing an optimal number of clusters for the data. RESULTS: The results of tests on various types of data showed that PFClust can discover clusters of arbitrary shapes, sizes and densities. The previous implementation of the algorithm had already been successfully used to cluster large macromolecular structures and small druglike compounds. We have greatly improved the algorithm by a more efficient implementation, which enables PFClust to process large data sets acceptably fast. CONCLUSIONS: In this paper we present a new optimized implementation of the PFClust algorithm that runs considerably faster than the original.