RACS: rapid analysis of ChIP-Seq data for contig based genomes
BACKGROUND: Chromatin immunoprecipitation coupled to next generation sequencing (ChIP-Seq) is a widely-used molecular method to investigate the function of chromatin-related proteins by identifying their associated DNA sequences on a genomic scale. ChIP-Seq generates large quantities of data that is...
Main authors: | Saettone, Alejandro; Ponce, Marcelo; Nabeel-Shah, Syed; Fillingham, Jeffrey |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2019 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6819487/ https://www.ncbi.nlm.nih.gov/pubmed/31664892 http://dx.doi.org/10.1186/s12859-019-3100-2 |
_version_ | 1783463742031265792 |
---|---|
author | Saettone, Alejandro Ponce, Marcelo Nabeel-Shah, Syed Fillingham, Jeffrey |
author_sort | Saettone, Alejandro |
collection | PubMed |
description | BACKGROUND: Chromatin immunoprecipitation coupled to next generation sequencing (ChIP-Seq) is a widely-used molecular method to investigate the function of chromatin-related proteins by identifying their associated DNA sequences on a genomic scale. ChIP-Seq generates large quantities of data that is difficult to process and analyze, particularly for organisms with contig-based sequenced genomes that typically have minimal annotation on their associated set of genes other than their associated coordinates primarily predicted by gene finding programs. Poorly annotated genome sequence makes comprehensive analysis of ChIP-Seq data difficult and as such standardized analysis pipelines are lacking. RESULTS: We present a one-stop computational pipeline, “Rapid Analysis of ChIP-Seq data” (RACS), that utilizes traditional High-Performance Computing (HPC) techniques in association with open source tools for processing and analyzing raw ChIP-Seq data. RACS is an open source computational pipeline available from either of the following repositories: https://bitbucket.org/mjponce/RACS or https://gitrepos.scinet.utoronto.ca/public/?a=summary&p=RACS. RACS is particularly useful for ChIP-Seq in organisms with contig-based genomes that have poor gene annotation to aid protein function discovery. To test the performance and efficiency of RACS, we analyzed previously published ChIP-Seq data from the model organism Tetrahymena thermophila, which has a contig-based genome. We assessed the generality of RACS by analyzing a previously published data set generated using the model organism Oxytricha trifallax, whose genome sequence is also contig-based with poor annotation. CONCLUSIONS: The RACS computational pipeline presented in this report is an efficient and reliable tool to analyze genome-wide raw ChIP-Seq data generated in model organisms with poorly annotated contig-based genome sequence. Because RACS segregates the found read accumulations between genic and intergenic regions, it is particularly efficient for rapid downstream analyses of proteins involved in gene expression. |
format | Online Article Text |
id | pubmed-6819487 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2019 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-6819487 2019-10-31 RACS: rapid analysis of ChIP-Seq data for contig based genomes Saettone, Alejandro Ponce, Marcelo Nabeel-Shah, Syed Fillingham, Jeffrey BMC Bioinformatics Methodology Article BACKGROUND: Chromatin immunoprecipitation coupled to next generation sequencing (ChIP-Seq) is a widely-used molecular method to investigate the function of chromatin-related proteins by identifying their associated DNA sequences on a genomic scale. ChIP-Seq generates large quantities of data that is difficult to process and analyze, particularly for organisms with contig-based sequenced genomes that typically have minimal annotation on their associated set of genes other than their associated coordinates primarily predicted by gene finding programs. Poorly annotated genome sequence makes comprehensive analysis of ChIP-Seq data difficult and as such standardized analysis pipelines are lacking. RESULTS: We present a one-stop computational pipeline, “Rapid Analysis of ChIP-Seq data” (RACS), that utilizes traditional High-Performance Computing (HPC) techniques in association with open source tools for processing and analyzing raw ChIP-Seq data. RACS is an open source computational pipeline available from either of the following repositories: https://bitbucket.org/mjponce/RACS or https://gitrepos.scinet.utoronto.ca/public/?a=summary&p=RACS. RACS is particularly useful for ChIP-Seq in organisms with contig-based genomes that have poor gene annotation to aid protein function discovery. To test the performance and efficiency of RACS, we analyzed previously published ChIP-Seq data from the model organism Tetrahymena thermophila, which has a contig-based genome. We assessed the generality of RACS by analyzing a previously published data set generated using the model organism Oxytricha trifallax, whose genome sequence is also contig-based with poor annotation. CONCLUSIONS: The RACS computational pipeline presented in this report is an efficient and reliable tool to analyze genome-wide raw ChIP-Seq data generated in model organisms with poorly annotated contig-based genome sequence. Because RACS segregates the found read accumulations between genic and intergenic regions, it is particularly efficient for rapid downstream analyses of proteins involved in gene expression. BioMed Central 2019-10-29 /pmc/articles/PMC6819487/ /pubmed/31664892 http://dx.doi.org/10.1186/s12859-019-3100-2 Text en © The Author(s) 2019 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
title | RACS: rapid analysis of ChIP-Seq data for contig based genomes |
topic | Methodology Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6819487/ https://www.ncbi.nlm.nih.gov/pubmed/31664892 http://dx.doi.org/10.1186/s12859-019-3100-2 |