
reval: A Python package to determine best clustering solutions with stability-based relative clustering validation

Determining the best partition for a dataset can be a challenging task because of the lack of a priori information within an unsupervised learning framework and the absence of a unique clustering validation approach to evaluate clustering solutions. Here we present reval: a Python package that lever...


Bibliographic Details
Main authors: Landi, Isotta, Mandelli, Veronica, Lombardo, Michael V.
Format: Online Article Text
Language: English
Published: Elsevier 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8085609/
https://www.ncbi.nlm.nih.gov/pubmed/33982023
http://dx.doi.org/10.1016/j.patter.2021.100228
_version_ 1783686378414931968
author Landi, Isotta
Mandelli, Veronica
Lombardo, Michael V.
author_facet Landi, Isotta
Mandelli, Veronica
Lombardo, Michael V.
author_sort Landi, Isotta
collection PubMed
description Determining the best partition for a dataset can be a challenging task because of the lack of a priori information within an unsupervised learning framework and the absence of a unique clustering validation approach to evaluate clustering solutions. Here we present reval: a Python package that leverages stability-based relative clustering validation methods to select best clustering solutions as the ones that replicate, via supervised learning, on unseen subsets of data. The implementation of relative validation methods can contribute to the theory of clustering by fostering new approaches for the investigation of clustering results in different situations and for different data distributions. This work aims at contributing to this effort by implementing a package that works with multiple clustering and classification algorithms, hence allowing both the automation of the labeling process and the assessment of the stability of different clustering mechanisms.
format Online
Article
Text
id pubmed-8085609
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Elsevier
record_format MEDLINE/PubMed
spelling pubmed-8085609 2021-05-11 reval: A Python package to determine best clustering solutions with stability-based relative clustering validation Landi, Isotta Mandelli, Veronica Lombardo, Michael V. Patterns (N Y) Descriptor Determining the best partition for a dataset can be a challenging task because of the lack of a priori information within an unsupervised learning framework and the absence of a unique clustering validation approach to evaluate clustering solutions. Here we present reval: a Python package that leverages stability-based relative clustering validation methods to select best clustering solutions as the ones that replicate, via supervised learning, on unseen subsets of data. The implementation of relative validation methods can contribute to the theory of clustering by fostering new approaches for the investigation of clustering results in different situations and for different data distributions. This work aims at contributing to this effort by implementing a package that works with multiple clustering and classification algorithms, hence allowing both the automation of the labeling process and the assessment of the stability of different clustering mechanisms. Elsevier 2021-04-02 /pmc/articles/PMC8085609/ /pubmed/33982023 http://dx.doi.org/10.1016/j.patter.2021.100228 Text en © 2021 The Authors https://creativecommons.org/licenses/by-nc-nd/4.0/ This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
spellingShingle Descriptor
Landi, Isotta
Mandelli, Veronica
Lombardo, Michael V.
reval: A Python package to determine best clustering solutions with stability-based relative clustering validation
title reval: A Python package to determine best clustering solutions with stability-based relative clustering validation
title_full reval: A Python package to determine best clustering solutions with stability-based relative clustering validation
title_fullStr reval: A Python package to determine best clustering solutions with stability-based relative clustering validation
title_full_unstemmed reval: A Python package to determine best clustering solutions with stability-based relative clustering validation
title_short reval: A Python package to determine best clustering solutions with stability-based relative clustering validation
title_sort reval: a python package to determine best clustering solutions with stability-based relative clustering validation
topic Descriptor
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8085609/
https://www.ncbi.nlm.nih.gov/pubmed/33982023
http://dx.doi.org/10.1016/j.patter.2021.100228
work_keys_str_mv AT landiisotta revalapythonpackagetodeterminebestclusteringsolutionswithstabilitybasedrelativeclusteringvalidation
AT mandelliveronica revalapythonpackagetodeterminebestclusteringsolutionswithstabilitybasedrelativeclusteringvalidation
AT lombardomichaelv revalapythonpackagetodeterminebestclusteringsolutionswithstabilitybasedrelativeclusteringvalidation
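
Illustrative sketch (not part of the catalog record). The abstract describes selecting clustering solutions that replicate, via supervised learning, on unseen subsets of data. The following is a minimal sketch of that general idea, not reval's actual API or implementation: it uses only the Python standard library, swaps in a plain k-means and a 1-nearest-neighbor classifier, and evaluates a single train/test split. All function names (`kmeans`, `one_nn_predict`, `stability_error`, `make_blobs`) and the synthetic two-blob data are assumptions made for illustration.

```python
import itertools
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two coordinate tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign(points, centroids):
    """Label each point with the index of its nearest centroid."""
    return [min(range(len(centroids)), key=lambda c: sq_dist(p, centroids[c]))
            for p in points]


def kmeans(points, k, rng, n_iter=50):
    """Plain Lloyd's k-means (hypothetical helper); returns one label per point."""
    centroids = rng.sample(points, k)
    for _ in range(n_iter):
        labels = assign(points, centroids)
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster empties out
                centroids[c] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return assign(points, centroids)


def one_nn_predict(train_pts, train_labels, test_pts):
    """1-nearest-neighbor classifier trained on the training cluster labels."""
    preds = []
    for q in test_pts:
        nearest = min(range(len(train_pts)), key=lambda i: sq_dist(q, train_pts[i]))
        preds.append(train_labels[nearest])
    return preds


def stability_error(points, k, seed=0):
    """Fraction of test points whose own cluster label disagrees with the label
    predicted by a classifier trained on the training-set clustering, after
    choosing the best one-to-one relabeling of the k clusters."""
    rng = random.Random(seed)
    shuffled = points[:]
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    train, test = shuffled[:half], shuffled[half:]
    train_labels = kmeans(train, k, rng)        # cluster the training half
    test_clusters = kmeans(test, k, rng)        # cluster the test half independently
    test_predicted = one_nn_predict(train, train_labels, test)
    # Cluster indices are arbitrary, so try every permutation (fine for small k).
    mismatches = min(
        sum(perm[c] != p for c, p in zip(test_clusters, test_predicted))
        for perm in itertools.permutations(range(k))
    )
    return mismatches / len(test)


def make_blobs(rng, centers, n_per_blob=40, spread=0.5):
    """Synthetic 2-D Gaussian blobs (illustrative data, not from the paper)."""
    return [(rng.gauss(cx, spread), rng.gauss(cy, spread))
            for cx, cy in centers for _ in range(n_per_blob)]


data = make_blobs(random.Random(42), [(0.0, 0.0), (10.0, 10.0)])
errors = {k: stability_error(data, k) for k in (2, 3, 4)}
print(errors)  # lower error means the k-cluster solution replicated better
```

Under this sketch, the candidate number of clusters with the lowest error would be preferred. A real analysis would likely repeat the split several times and compare the errors against a chance baseline, which this single-split sketch omits.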