
Automated calibration of consensus weighted distance-based clustering approaches using sharp

MOTIVATION: In consensus clustering, a clustering algorithm is used in combination with a subsampling procedure to detect stable clusters. Previous studies on both simulated and real data suggest that consensus clustering outperforms native algorithms. RESULTS: We extend here consensus clustering to...


Bibliographic Details
Main authors: Bodinier, Barbara; Vuckovic, Dragana; Rodrigues, Sabrina; Filippi, Sarah; Chiquet, Julien; Chadeau-Hyam, Marc
Format: Online Article Text
Language: English
Published: Oxford University Press 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10627366/
https://www.ncbi.nlm.nih.gov/pubmed/37847776
http://dx.doi.org/10.1093/bioinformatics/btad635
_version_ 1785131518446796800
author Bodinier, Barbara
Vuckovic, Dragana
Rodrigues, Sabrina
Filippi, Sarah
Chiquet, Julien
Chadeau-Hyam, Marc
author_sort Bodinier, Barbara
collection PubMed
description MOTIVATION: In consensus clustering, a clustering algorithm is used in combination with a subsampling procedure to detect stable clusters. Previous studies on both simulated and real data suggest that consensus clustering outperforms native algorithms. RESULTS: We extend here consensus clustering to allow for attribute weighting in the calculation of pairwise distances using existing regularized approaches. We propose a procedure for the calibration of the number of clusters (and regularization parameter) by maximizing the sharp score, a novel stability score calculated directly from consensus clustering outputs, making it extremely computationally competitive. Our simulation study shows better clustering performances of (i) approaches calibrated by maximizing the sharp score compared to existing calibration scores and (ii) weighted compared to unweighted approaches in the presence of features that do not contribute to cluster definition. Application on real gene expression data measured in lung tissue reveals clear clusters corresponding to different lung cancer subtypes. AVAILABILITY AND IMPLEMENTATION: The R package sharp (version ≥ 1.4.3) is available on CRAN at https://CRAN.R-project.org/package=sharp.
format Online
Article
Text
id pubmed-10627366
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-10627366 2023-11-07 Bioinformatics Original Paper Oxford University Press 2023-10-17 /pmc/articles/PMC10627366/ /pubmed/37847776 http://dx.doi.org/10.1093/bioinformatics/btad635 Text en © The Author(s) 2023. Published by Oxford University Press.
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title Automated calibration of consensus weighted distance-based clustering approaches using sharp
topic Original Paper
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10627366/
https://www.ncbi.nlm.nih.gov/pubmed/37847776
http://dx.doi.org/10.1093/bioinformatics/btad635