
Coral: an integrated suite of visualizations for comparing clusterings

BACKGROUND: Clustering has become a standard analysis for many types of biological data (e.g., interaction networks, gene expression, metagenomic abundance). In practice, it is possible to obtain a large number of contradictory clusterings by varying which clustering algorithm is used, which data attr...


Bibliographic Details
Main authors: Filippova, Darya; Gadani, Aashish; Kingsford, Carl
Format: Online Article Text
Language: English
Published: BioMed Central 2012
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3576325/
https://www.ncbi.nlm.nih.gov/pubmed/23102108
http://dx.doi.org/10.1186/1471-2105-13-276
_version_ 1782259839640010752
author Filippova, Darya
Gadani, Aashish
Kingsford, Carl
collection PubMed
description BACKGROUND: Clustering has become a standard analysis for many types of biological data (e.g., interaction networks, gene expression, metagenomic abundance). In practice, it is possible to obtain a large number of contradictory clusterings by varying which clustering algorithm is used, which data attributes are considered, how algorithmic parameters are set, and which near-optimal clusterings are chosen. It is difficult to sift through such a large collection of varied clusterings to determine which clustering features are affected by parameter settings or are artifacts of particular algorithms, and which represent meaningful patterns. Knowing which items are often clustered together helps to improve our understanding of the underlying data and to increase our confidence in the generated modules. RESULTS: We present Coral, an application for interactive exploration of large ensembles of clusterings. Coral makes all-to-all clustering comparison easy, supports exploration of individual clusterings, allows tracking modules across clusterings, and supports identification of core and peripheral items in modules. We discuss how each visual component in Coral tackles a specific question related to clustering comparison and provide examples of their use. We also show how Coral could be used to visually and quantitatively compare clusterings with a ground truth clustering. CONCLUSION: As a case study, we compare clusterings of a recently published protein interaction network of Arabidopsis thaliana. We use several popular algorithms to generate the network’s clusterings. We find that the clusterings vary significantly and that few proteins are consistently co-clustered in all clusterings. This is evidence that several clusterings should typically be considered when evaluating modules of genes, proteins, or sequences, and Coral can be used to perform a comprehensive analysis of these clustering ensembles.
format Online
Article
Text
id pubmed-3576325
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3576325 2013-02-22 Coral: an integrated suite of visualizations for comparing clusterings. Filippova, Darya; Gadani, Aashish; Kingsford, Carl. BMC Bioinformatics. Software. BioMed Central 2012-10-29 /pmc/articles/PMC3576325/ /pubmed/23102108 http://dx.doi.org/10.1186/1471-2105-13-276 Text en Copyright ©2012 Filippova et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Coral: an integrated suite of visualizations for comparing clusterings
topic Software
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3576325/
https://www.ncbi.nlm.nih.gov/pubmed/23102108
http://dx.doi.org/10.1186/1471-2105-13-276
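
The abstract's idea of items that are "consistently co-clustered in all clusterings" can be illustrated with a small sketch. This is not Coral's implementation: the function names, the toy data, and the choice of a Jaccard index over co-clustered pairs as the comparison measure are all assumptions made for illustration. Each clustering is represented as a list of sets of item labels.

```python
from collections import Counter
from itertools import combinations


def co_clustered_pairs(clustering):
    """Return the set of unordered item pairs that share a cluster.

    `clustering` is a list of sets; each set is one cluster (hypothetical format).
    """
    pairs = set()
    for cluster in clustering:
        # Sort so each pair has one canonical orientation, e.g. ("A", "B").
        for a, b in combinations(sorted(cluster), 2):
            pairs.add((a, b))
    return pairs


def co_cluster_counts(clusterings):
    """Count, for every item pair, how many clusterings place it in one cluster."""
    counts = Counter()
    for clustering in clusterings:
        counts.update(co_clustered_pairs(clustering))
    return counts


def jaccard(clustering_a, clustering_b):
    """Jaccard index over co-clustered pairs (one possible similarity measure)."""
    pairs_a = co_clustered_pairs(clustering_a)
    pairs_b = co_clustered_pairs(clustering_b)
    union = pairs_a | pairs_b
    return len(pairs_a & pairs_b) / len(union) if union else 1.0


# Toy ensemble of three clusterings over five items (made-up data).
clusterings = [
    [{"A", "B", "C"}, {"D", "E"}],
    [{"A", "B"}, {"C", "D", "E"}],
    [{"A", "B", "D"}, {"C", "E"}],
]

counts = co_cluster_counts(clusterings)

# Pairs that appear together in every clustering of the ensemble.
always_together = sorted(
    pair for pair, n in counts.items() if n == len(clusterings)
)
print(always_together)
print(jaccard(clusterings[0], clusterings[1]))
```

In this toy ensemble only one pair stays together across all three clusterings, which mirrors the abstract's observation that few items are consistently co-clustered when algorithms or parameters change.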