The BioDICE Taverna plugin for clustering and visualization of biological data: a workflow for molecular compounds exploration

BACKGROUND: In many experimental pipelines, clustering of multidimensional biological datasets is used to detect hidden structures in unlabelled input data. Taverna is a popular workflow management system that is used to design and execute scientific workflows and aid in silico experimentation. The...

Bibliographic Details
Main Authors: Fiannaca, Antonino; Rosa, Massimo La; Fatta, Giuseppe Di; Gaglio, Salvatore; Rizzo, Riccardo; Urso, Alfonso
Format: Online Article Text
Language: English
Published: BioMed Central 2014
Subjects: Software
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4036106/
http://dx.doi.org/10.1186/1758-2946-6-24
author Fiannaca, Antonino
Rosa, Massimo La
Fatta, Giuseppe Di
Gaglio, Salvatore
Rizzo, Riccardo
Urso, Alfonso
author_sort Fiannaca, Antonino
collection PubMed
description BACKGROUND: In many experimental pipelines, clustering of multidimensional biological datasets is used to detect hidden structures in unlabelled input data. Taverna is a popular workflow management system that is used to design and execute scientific workflows and to aid in silico experimentation. The availability of fast unsupervised methods for clustering and visualization in the Taverna platform is important to support data-driven scientific discovery in complex and explorative bioinformatics applications. RESULTS: This work presents a Taverna plugin, the Biological Data Interactive Clustering Explorer (BioDICE), that performs clustering of high-dimensional biological data and provides a nonlinear, topology-preserving projection for the visualization of the input data and their similarities. The core algorithm in the BioDICE plugin is the Fast Learning Self Organizing Map (FLSOM), which is an improved variant of the Self Organizing Map (SOM) algorithm. The plugin generates an interactive 2D map that allows the visual exploration of multidimensional data and the identification of groups of similar objects. The effectiveness of the plugin is demonstrated on a case study related to chemical compounds. CONCLUSIONS: Its extensibility and the number and variety of available tools have made Taverna a popular choice for the development of scientific data workflows. This work presents a novel plugin, BioDICE, which adds a data-driven knowledge discovery component to Taverna. BioDICE provides an effective and powerful clustering tool, which can be adopted for the explorative analysis of biological datasets.
format Online
Article
Text
id pubmed-4036106
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4036106 2014-05-29 J Cheminform Software
BioMed Central 2014-05-13 /pmc/articles/PMC4036106/ http://dx.doi.org/10.1186/1758-2946-6-24 Text en Copyright © 2014 Fiannaca et al.; licensee Chemistry Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License ( http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver ( http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title The BioDICE Taverna plugin for clustering and visualization of biological data: a workflow for molecular compounds exploration
topic Software
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4036106/
http://dx.doi.org/10.1186/1758-2946-6-24