The Gaggle: An open-source software system for integrating bioinformatics software and data sources

BACKGROUND: Systems biologists work with many kinds of data, from many different sources, using a variety of software tools. Each of these tools typically excels at one type of analysis, such as the analysis of microarrays, metabolic networks, or predicted protein structure. A crucial challenge is to combi...

Bibliographic Details
Main authors: Shannon, Paul T; Reiss, David J; Bonneau, Richard; Baliga, Nitin S
Format: Text
Language: English
Published: BioMed Central 2006
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1464137/
https://www.ncbi.nlm.nih.gov/pubmed/16569235
http://dx.doi.org/10.1186/1471-2105-7-176
author Shannon, Paul T
Reiss, David J
Bonneau, Richard
Baliga, Nitin S
collection PubMed
description BACKGROUND: Systems biologists work with many kinds of data, from many different sources, using a variety of software tools. Each of these tools typically excels at one type of analysis, such as the analysis of microarrays, metabolic networks, or predicted protein structure. A crucial challenge is to combine the capabilities of these (and other forthcoming) data resources and tools to create a data exploration and analysis environment that does justice to the variety and complexity of systems biology data sets. A solution to this problem should recognize that data types, formats and software in this high-throughput age of biology are constantly changing. RESULTS: In this paper we describe the Gaggle, a simple, open-source Java software environment that helps to solve the problem of software and database integration. Guided by the classic software engineering strategy of separation of concerns and a policy of semantic flexibility, it integrates existing popular programs and web resources into a user-friendly, easily extended environment. We demonstrate that four simple data types (names, matrices, networks, and associative arrays) are sufficient to bring together diverse databases and software. We highlight some capabilities of the Gaggle with an exploration of Helicobacter pylori pathogenesis genes, in which we identify a putative ricin-like protein, a discovery made possible by simultaneous data exploration using a wide range of publicly available data and a variety of popular bioinformatics software tools. CONCLUSION: We have integrated diverse databases (for example, KEGG, BioCyc, STRING) and software (Cytoscape, DataMatrixViewer, R statistical environment, and TIGR Microarray Expression Viewer).
Through this loose coupling of diverse software and databases the Gaggle enables simultaneous exploration of experimental data (mRNA and protein abundance, protein-protein and protein-DNA interactions), functional associations (operon, chromosomal proximity, phylogenetic pattern), metabolic pathways (KEGG) and PubMed abstracts (STRING web resource), creating an exploratory environment useful to 'web browser and spreadsheet biologists', to statistically savvy computational biologists, and to those in between. The Gaggle uses Java RMI and Java Web Start technologies and can be found at .
format Text
id pubmed-1464137
institution National Center for Biotechnology Information
language English
publishDate 2006
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-1464137 2006-05-23 BMC Bioinformatics Software BioMed Central 2006-03-28 /pmc/articles/PMC1464137/ /pubmed/16569235 http://dx.doi.org/10.1186/1471-2105-7-176 Text en Copyright © 2006 Shannon et al; licensee BioMed Central Ltd.
title The Gaggle: An open-source software system for integrating bioinformatics software and data sources
topic Software