
SigWin-detector: a Grid-enabled workflow for discovering enriched windows of genomic features related to DNA sequences

BACKGROUND: Chromosome location is often used as a scaffold to organize genomic information in both the living cell and molecular biological research. Thus, ever-increasing amounts of data about genomic features are stored in public databases and can be readily visualized by genome browsers. To perf...


Bibliographic Details
Main authors: Inda, Márcia A; van Batenburg, Marinus F; Roos, Marco; Belloum, Adam SZ; Vasunin, Dmitry; Wibisono, Adianto; van Kampen, Antoine HC; Breit, Timo M
Format: Text
Language: English
Published: BioMed Central 2008
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2533338/
https://www.ncbi.nlm.nih.gov/pubmed/18710516
http://dx.doi.org/10.1186/1756-0500-1-63
author Inda, Márcia A
van Batenburg, Marinus F
Roos, Marco
Belloum, Adam SZ
Vasunin, Dmitry
Wibisono, Adianto
van Kampen, Antoine HC
Breit, Timo M
author_sort Inda, Márcia A
collection PubMed
description BACKGROUND: Chromosome location is often used as a scaffold to organize genomic information in both the living cell and molecular biological research. Thus, ever-increasing amounts of data about genomic features are stored in public databases and can be readily visualized by genome browsers. To perform in silico experimentation conveniently with this genomics data, biologists need tools to process and compare datasets routinely and explore the obtained results interactively. The complexity of such experimentation requires these tools to be based on an e-Science approach, hence generic, modular, and reusable. A virtual laboratory environment with workflows, workflow management systems, and Grid computation are therefore essential. FINDINGS: Here we apply an e-Science approach to develop SigWin-detector, a workflow-based tool that can detect significantly enriched windows of (genomic) features in a (DNA) sequence in a fast and reproducible way. For proof-of-principle, we utilize a biological use case to detect regions of increased and decreased gene expression (RIDGEs and anti-RIDGEs) in human transcriptome maps. We improved the original method for RIDGE detection by replacing the costly step of estimation by random sampling with a faster analytical formula for computing the distribution of the null hypothesis being tested and by developing a new algorithm for computing moving medians. SigWin-detector was developed using the WS-VLAM workflow management system and consists of several reusable modules that are linked together in a basic workflow. The configuration of this basic workflow can be adapted to satisfy the requirements of the specific in silico experiment. CONCLUSION: As we show with the results from analyses in the biological use case on RIDGEs, SigWin-detector is an efficient and reusable Grid-based tool for discovering windows enriched for features of a particular type in any sequence of values. Thus, SigWin-detector provides the proof-of-principle for the modular e-Science based concept of integrative bioinformatics experimentation.
format Text
id pubmed-2533338
institution National Center for Biotechnology Information
language English
publishDate 2008
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-2533338 2008-09-11 SigWin-detector: a Grid-enabled workflow for discovering enriched windows of genomic features related to DNA sequences Inda, Márcia A van Batenburg, Marinus F Roos, Marco Belloum, Adam SZ Vasunin, Dmitry Wibisono, Adianto van Kampen, Antoine HC Breit, Timo M BMC Res Notes Technical Note BACKGROUND: Chromosome location is often used as a scaffold to organize genomic information in both the living cell and molecular biological research. Thus, ever-increasing amounts of data about genomic features are stored in public databases and can be readily visualized by genome browsers. To perform in silico experimentation conveniently with this genomics data, biologists need tools to process and compare datasets routinely and explore the obtained results interactively. The complexity of such experimentation requires these tools to be based on an e-Science approach, hence generic, modular, and reusable. A virtual laboratory environment with workflows, workflow management systems, and Grid computation are therefore essential. FINDINGS: Here we apply an e-Science approach to develop SigWin-detector, a workflow-based tool that can detect significantly enriched windows of (genomic) features in a (DNA) sequence in a fast and reproducible way. For proof-of-principle, we utilize a biological use case to detect regions of increased and decreased gene expression (RIDGEs and anti-RIDGEs) in human transcriptome maps. We improved the original method for RIDGE detection by replacing the costly step of estimation by random sampling with a faster analytical formula for computing the distribution of the null hypothesis being tested and by developing a new algorithm for computing moving medians. SigWin-detector was developed using the WS-VLAM workflow management system and consists of several reusable modules that are linked together in a basic workflow. The configuration of this basic workflow can be adapted to satisfy the requirements of the specific in silico experiment. CONCLUSION: As we show with the results from analyses in the biological use case on RIDGEs, SigWin-detector is an efficient and reusable Grid-based tool for discovering windows enriched for features of a particular type in any sequence of values. Thus, SigWin-detector provides the proof-of-principle for the modular e-Science based concept of integrative bioinformatics experimentation. BioMed Central 2008-08-08 /pmc/articles/PMC2533338/ /pubmed/18710516 http://dx.doi.org/10.1186/1756-0500-1-63 Text en Copyright © 2008 Inda et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title SigWin-detector: a Grid-enabled workflow for discovering enriched windows of genomic features related to DNA sequences
topic Technical Note
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2533338/
https://www.ncbi.nlm.nih.gov/pubmed/18710516
http://dx.doi.org/10.1186/1756-0500-1-63