Cloud-based uniform ChIP-Seq processing tools for modENCODE and ENCODE

BACKGROUND: Funded by the National Institutes of Health (NIH), the aim of the Model Organism ENCyclopedia of DNA Elements (modENCODE) project is to provide the biological research community with a comprehensive encyclopedia of functional genomic elements for both model organisms C. elegans (worm) an...

Bibliographic Details
Main authors: Trinh, Quang M, Jen, Fei-Yang Arthur, Zhou, Ziru, Chu, Kar Ming, Perry, Marc D, Kephart, Ellen T, Contrino, Sergio, Ruzanov, Peter, Stein, Lincoln D
Format: Online Article Text
Language: English
Published: BioMed Central 2013
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3734164/
https://www.ncbi.nlm.nih.gov/pubmed/23875683
http://dx.doi.org/10.1186/1471-2164-14-494
author Trinh, Quang M
Jen, Fei-Yang Arthur
Zhou, Ziru
Chu, Kar Ming
Perry, Marc D
Kephart, Ellen T
Contrino, Sergio
Ruzanov, Peter
Stein, Lincoln D
author_sort Trinh, Quang M
collection PubMed
description BACKGROUND: Funded by the National Institutes of Health (NIH), the Model Organism ENCyclopedia of DNA Elements (modENCODE) project aims to provide the biological research community with a comprehensive encyclopedia of functional genomic elements for the model organisms C. elegans (worm) and D. melanogaster (fly). With just under 10 terabytes of data collected and released to the public, one of the challenges researchers face is extracting biologically meaningful knowledge from this large data set. While the basic quality control, pre-processing, and analysis of the data have already been performed by members of the modENCODE consortium, many researchers will wish to reinterpret the data set using modifications and enhancements of the original protocols, or to combine modENCODE data with other data sets. Unfortunately, this can be a time-consuming and logistically challenging proposition. RESULTS: In recognition of this challenge, the modENCODE DCC has released uniform computing resources for analyzing modENCODE data on Galaxy (https://github.com/modENCODE-DCC/Galaxy), on the public Amazon Cloud (http://aws.amazon.com), and on the private Bionimbus Cloud for genomic research (http://www.bionimbus.org). In particular, we have released Galaxy workflows for interpreting ChIP-seq data that use the same quality control (QC) and peak calling standards adopted by the modENCODE and ENCODE communities. For convenience, we have created Amazon and Bionimbus Cloud machine images containing Galaxy along with all the modENCODE data, software, and other dependencies. CONCLUSIONS: These resources provide a framework for running consistent and reproducible analyses on modENCODE data, ultimately allowing researchers to spend more of their time using modENCODE data and less time moving it around.
format Online
Article
Text
id pubmed-3734164
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3734164 2013-08-06 Cloud-based uniform ChIP-Seq processing tools for modENCODE and ENCODE Trinh, Quang M Jen, Fei-Yang Arthur Zhou, Ziru Chu, Kar Ming Perry, Marc D Kephart, Ellen T Contrino, Sergio Ruzanov, Peter Stein, Lincoln D BMC Genomics Software BioMed Central 2013-07-22 /pmc/articles/PMC3734164/ /pubmed/23875683 http://dx.doi.org/10.1186/1471-2164-14-494 Text en Copyright © 2013 Trinh et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Cloud-based uniform ChIP-Seq processing tools for modENCODE and ENCODE
topic Software
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3734164/
https://www.ncbi.nlm.nih.gov/pubmed/23875683
http://dx.doi.org/10.1186/1471-2164-14-494