GeNNet: an integrated platform for unifying scientific workflows and graph databases for transcriptome data analysis

There are many steps in analyzing transcriptome data, from the acquisition of raw data to the selection of a subset of representative genes that explain a scientific hypothesis. The data produced can be represented as networks of interactions among genes and these may additionally be integrated with...

Bibliographic Details
Main Authors: Costa, Raquel L., Gadelha, Luiz, Ribeiro-Alves, Marcelo, Porto, Fábio
Format: Online Article Text
Language: English
Published: PeerJ Inc. 2017
Subjects: Bioinformatics
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5501156/
https://www.ncbi.nlm.nih.gov/pubmed/28695067
http://dx.doi.org/10.7717/peerj.3509
author Costa, Raquel L.
Gadelha, Luiz
Ribeiro-Alves, Marcelo
Porto, Fábio
collection PubMed
description Analyzing transcriptome data involves many steps, from the acquisition of raw data to the selection of a subset of representative genes that explain a scientific hypothesis. The data produced can be represented as networks of interactions among genes, and these may additionally be integrated with other biological databases, such as protein-protein interactions, transcription factors and gene annotation. However, the results of these analyses remain fragmented, which complicates both later inspection of results and meta-analysis that incorporates new related data. Integrating databases and tools into scientific workflows, orchestrating their execution, and managing the resulting data and its metadata are challenging tasks. Considerable effort is also required to run in-silico experiments and to structure and compose the information as needed for analysis. Different programs may need to be applied, and different files are produced during the experiment cycle. In this context, the availability of a platform supporting experiment execution is paramount. We present GeNNet, an integrated transcriptome analysis platform that unifies scientific workflows with graph databases for selecting relevant genes according to the evaluated biological systems. It includes GeNNet-Wf, a scientific workflow that pre-loads biological data, pre-processes raw microarray data and conducts a series of analyses including normalization, differential expression inference, clustering and gene set enrichment analysis. A user-friendly web interface, GeNNet-Web, allows for setting parameters, executing GeNNet-Wf, and visualizing the results of its executions. To demonstrate the features of GeNNet, we performed case studies with data retrieved from GEO, particularly using a single-factor experiment in different analysis scenarios. As a result, we obtained differentially expressed genes for which biological functions were analyzed. The results are integrated into GeNNet-DB, a database about genes, clusters, experiments and their properties and relationships. The resulting graph database is explored with queries that demonstrate the expressiveness of this data model for reasoning about gene interaction networks. GeNNet is the first platform to integrate the analytical process of transcriptome data with graph databases. It provides a comprehensive set of tools that would otherwise be challenging for non-expert users to install and use. Developers can add new functionality to components of GeNNet. The derived data allows for testing previous hypotheses about an experiment and exploring new ones through the interactive graph database environment. It enables the analysis of human, rhesus, mouse and rat data from Affymetrix platforms. GeNNet is available as an open source platform at https://github.com/raquele/GeNNet and can be retrieved as a software container with the command docker pull quelopes/gennet.
format Online
Article
Text
id pubmed-5501156
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher PeerJ Inc.
record_format MEDLINE/PubMed
spelling pubmed-5501156 2017-07-10 GeNNet: an integrated platform for unifying scientific workflows and graph databases for transcriptome data analysis Costa, Raquel L. Gadelha, Luiz Ribeiro-Alves, Marcelo Porto, Fábio PeerJ Bioinformatics PeerJ Inc. 2017-07-05 /pmc/articles/PMC5501156/ /pubmed/28695067 http://dx.doi.org/10.7717/peerj.3509 Text en ©2017 Costa et al. http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ) and either DOI or URL of the article must be cited.
title GeNNet: an integrated platform for unifying scientific workflows and graph databases for transcriptome data analysis
topic Bioinformatics