ChIP-seq Analysis in R (CSAR): An R package for the statistical detection of protein-bound genomic regions
BACKGROUND: In vivo detection of protein-bound genomic regions can be achieved by combining chromatin-immunoprecipitation with next-generation sequencing technology (ChIP-seq). The large amount of sequence data produced by this method needs to be analyzed in a statistically proper and computationall...
Main authors: | Muiño, Jose M; Kaufmann, Kerstin; van Ham, Roeland CHJ; Angenent, Gerco C; Krajewski, Pawel |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2011 |
Subjects: | Software |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3114017/ https://www.ncbi.nlm.nih.gov/pubmed/21554688 http://dx.doi.org/10.1186/1746-4811-7-11 |
_version_ | 1782206021378244608 |
---|---|
author | Muiño, Jose M; Kaufmann, Kerstin; van Ham, Roeland CHJ; Angenent, Gerco C; Krajewski, Pawel |
author_facet | Muiño, Jose M; Kaufmann, Kerstin; van Ham, Roeland CHJ; Angenent, Gerco C; Krajewski, Pawel |
author_sort | Muiño, Jose M |
collection | PubMed |
description | BACKGROUND: In vivo detection of protein-bound genomic regions can be achieved by combining chromatin-immunoprecipitation with next-generation sequencing technology (ChIP-seq). The large amount of sequence data produced by this method needs to be analyzed in a statistically proper and computationally efficient manner. The generation of high copy numbers of DNA fragments as an artifact of the PCR step in ChIP-seq is an important source of bias of this methodology. RESULTS: We present here an R package for the statistical analysis of ChIP-seq experiments. Taking the average size of DNA fragments subjected to sequencing into account, the software calculates single-nucleotide read-enrichment values. After normalization, sample and control are compared using a test based on the ratio test or the Poisson distribution. Test statistic thresholds to control the false discovery rate are obtained through random permutations. Computational efficiency is achieved by implementing the most time-consuming functions in C++ and integrating these in the R package. An analysis of simulated and experimental ChIP-seq data is presented to demonstrate the robustness of our method against PCR-artefacts and its adequate control of the error rate. CONCLUSIONS: The software ChIP-seq Analysis in R (CSAR) enables fast and accurate detection of protein-bound genomic regions through the analysis of ChIP-seq experiments. Compared to existing methods, we found that our package shows greater robustness against PCR-artefacts and better control of the error rate. |
format | Online Article Text |
id | pubmed-3114017 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2011 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-31140172011-06-14 ChIP-seq Analysis in R (CSAR): An R package for the statistical detection of protein-bound genomic regions Muiño, Jose M Kaufmann, Kerstin van Ham, Roeland CHJ Angenent, Gerco C Krajewski, Pawel Plant Methods Software BACKGROUND: In vivo detection of protein-bound genomic regions can be achieved by combining chromatin-immunoprecipitation with next-generation sequencing technology (ChIP-seq). The large amount of sequence data produced by this method needs to be analyzed in a statistically proper and computationally efficient manner. The generation of high copy numbers of DNA fragments as an artifact of the PCR step in ChIP-seq is an important source of bias of this methodology. RESULTS: We present here an R package for the statistical analysis of ChIP-seq experiments. Taking the average size of DNA fragments subjected to sequencing into account, the software calculates single-nucleotide read-enrichment values. After normalization, sample and control are compared using a test based on the ratio test or the Poisson distribution. Test statistic thresholds to control the false discovery rate are obtained through random permutations. Computational efficiency is achieved by implementing the most time-consuming functions in C++ and integrating these in the R package. An analysis of simulated and experimental ChIP-seq data is presented to demonstrate the robustness of our method against PCR-artefacts and its adequate control of the error rate. CONCLUSIONS: The software ChIP-seq Analysis in R (CSAR) enables fast and accurate detection of protein-bound genomic regions through the analysis of ChIP-seq experiments. Compared to existing methods, we found that our package shows greater robustness against PCR-artefacts and better control of the error rate. BioMed Central 2011-05-09 /pmc/articles/PMC3114017/ /pubmed/21554688 http://dx.doi.org/10.1186/1746-4811-7-11 Text en Copyright ©2011 Muiño et al; licensee BioMed Central Ltd. 
http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Software Muiño, Jose M Kaufmann, Kerstin van Ham, Roeland CHJ Angenent, Gerco C Krajewski, Pawel ChIP-seq Analysis in R (CSAR): An R package for the statistical detection of protein-bound genomic regions |
title | ChIP-seq Analysis in R (CSAR): An R package for the statistical detection of protein-bound genomic regions |
topic | Software |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3114017/ https://www.ncbi.nlm.nih.gov/pubmed/21554688 http://dx.doi.org/10.1186/1746-4811-7-11 |
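The workflow summarized in the description field (extend reads to the average fragment size, compute single-nucleotide enrichment values, score sample against control with a Poisson-based test, and derive a score threshold from random permutations) can be illustrated with a minimal Python sketch. This is not the CSAR package or its API: every function name, parameter, and default below is hypothetical, the normalization step is omitted, and the permutation step uses a quantile of the per-permutation maximum score as a simplified stand-in for the false discovery rate procedure the record mentions.

```python
import math
import random


def coverage(reads, fragment_len, chrom_len):
    """Count, for every nucleotide, how many extended fragments cover it.

    `reads` is a list of (start, strand) tuples with 0-based starts.
    Each read is extended to `fragment_len` in its strand direction.
    """
    diff = [0] * (chrom_len + 1)
    for start, strand in reads:
        if strand == "+":
            lo, hi = start, start + fragment_len
        else:
            lo, hi = start - fragment_len + 1, start + 1
        lo, hi = max(lo, 0), min(hi, chrom_len)
        if lo < hi:
            diff[lo] += 1   # fragment begins covering here
            diff[hi] -= 1   # fragment stops covering here
    cov, running = [], 0
    for d in diff[:chrom_len]:
        running += d
        cov.append(running)
    return cov


def poisson_score(sample, control, min_lambda=1.0):
    """Return -log10 of P(X >= sample) for X ~ Poisson(max(control, min_lambda)).

    The floor `min_lambda` is an assumption made here to avoid a zero rate.
    """
    lam = max(control, min_lambda)
    term = math.exp(-lam)          # P(X = 0)
    cdf = 0.0
    for k in range(sample):
        cdf += term                # accumulate P(X = k)
        term *= lam / (k + 1)      # advance to P(X = k + 1)
    p_value = max(1.0 - cdf, 1e-300)
    return -math.log10(p_value)


def max_score(sample_reads, control_reads, fragment_len, chrom_len):
    """Largest per-nucleotide Poisson score along the chromosome."""
    s_cov = coverage(sample_reads, fragment_len, chrom_len)
    c_cov = coverage(control_reads, fragment_len, chrom_len)
    return max(poisson_score(s, c) for s, c in zip(s_cov, c_cov))


def permutation_threshold(sample_reads, control_reads, fragment_len,
                          chrom_len, n_perm=20, quantile=0.95, seed=0):
    """Score threshold from label permutations (simplified, see lead-in)."""
    rng = random.Random(seed)
    pooled = list(sample_reads) + list(control_reads)
    n_sample = len(sample_reads)
    maxima = []
    for _ in range(n_perm):
        rng.shuffle(pooled)        # randomly reassign sample/control labels
        maxima.append(max_score(pooled[:n_sample], pooled[n_sample:],
                                fragment_len, chrom_len))
    maxima.sort()
    return maxima[min(int(quantile * n_perm), n_perm - 1)]


if __name__ == "__main__":
    # Toy data: sample reads pile up near position 50, control reads are spread out.
    sample = [(48, "+"), (49, "+"), (50, "+"), (60, "-"), (61, "-"), (10, "+")]
    control = [(5, "+"), (30, "-"), (70, "+"), (90, "-"), (45, "+"), (20, "+")]
    observed = max_score(sample, control, fragment_len=10, chrom_len=100)
    threshold = permutation_threshold(sample, control, 10, 100)
    print("observed max score:", observed)
    print("permutation threshold:", threshold)
```

The coverage step uses a difference array so that each read costs constant time regardless of fragment length, which mirrors in spirit the record's point that the time-consuming steps need an efficient implementation.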