WACS: improving ChIP-seq peak calling by optimally weighting controls

BACKGROUND: Chromatin immunoprecipitation followed by high throughput sequencing (ChIP-seq), initially introduced more than a decade ago, is widely used by the scientific community to detect protein/DNA binding and histone modifications across the genome. Every experiment is prone to noise and bias,...

Bibliographic Details
Main Authors: Awdeh, Aseel; Turcotte, Marcel; Perkins, Theodore J.
Format: Online Article Text
Language: English
Published: BioMed Central 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7885521/
https://www.ncbi.nlm.nih.gov/pubmed/33588754
http://dx.doi.org/10.1186/s12859-020-03927-2
_version_ 1783651622589562880
author Awdeh, Aseel
Turcotte, Marcel
Perkins, Theodore J.
collection PubMed
description BACKGROUND: Chromatin immunoprecipitation followed by high throughput sequencing (ChIP-seq), initially introduced more than a decade ago, is widely used by the scientific community to detect protein/DNA binding and histone modifications across the genome. Every experiment is prone to noise and bias, and ChIP-seq experiments are no exception. To alleviate bias, the incorporation of control datasets in ChIP-seq analysis is an essential step. The controls are used to account for the background signal, while the remainder of the ChIP-seq signal captures true binding or histone modification. However, a recurrent issue is different types of bias in different ChIP-seq experiments. Depending on which controls are used, different aspects of ChIP-seq bias are better or worse accounted for, and peak calling can produce different results for the same ChIP-seq experiment. Consequently, generating “smart” controls, which model the non-signal effect for a specific ChIP-seq experiment, could enhance contrast and increase the reliability and reproducibility of the results. RESULT: We propose a peak calling algorithm, Weighted Analysis of ChIP-seq (WACS), which is an extension of the well-known peak caller MACS2. There are two main steps in WACS: First, weights are estimated for each control using non-negative least squares regression. The goal is to customize controls to model the noise distribution for each ChIP-seq experiment. This is then followed by peak calling. We demonstrate that WACS significantly outperforms MACS2 and AIControl, another recent algorithm for generating smart controls, in the detection of enriched regions along the genome, in terms of motif enrichment and reproducibility analyses. CONCLUSIONS: This ultimately improves our understanding of ChIP-seq controls and their biases, and shows that WACS results in a better approximation of the noise distribution in controls.
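The weight-estimation step described above can be illustrated with a small sketch. It is not the WACS implementation: the abstract does not say how the regression is set up, so the sketch assumes that per-window read counts of the ChIP-seq experiment are regressed on per-window read counts of each control. The function names (`estimate_control_weights`, `weighted_background`), the toy counts, and the solver (plain projected gradient descent, used here in place of a dedicated non-negative least squares routine) are all hypothetical choices made for illustration.

```python
# Illustrative sketch only: names, data layout, and solver are assumptions,
# not taken from the WACS software.

def estimate_control_weights(controls, chip, iterations=2000):
    """Fit weights w >= 0 minimising ||chip - sum_i w[i] * controls[i]||^2.

    controls: list of k lists, each holding read counts per genomic window.
    chip:     list of read counts per window for the ChIP-seq experiment.
    Solver:   projected gradient descent, a simple stand-in for a dedicated
              non-negative least squares routine.
    """
    k, n = len(controls), len(chip)
    # Gram matrix X^T X and right-hand side X^T y of the least squares problem.
    gram = [[sum(controls[i][t] * controls[j][t] for t in range(n))
             for j in range(k)] for i in range(k)]
    rhs = [sum(controls[i][t] * chip[t] for t in range(n)) for i in range(k)]
    # The trace bounds the largest eigenvalue of the Gram matrix, so 1/trace
    # is a safe (if conservative) step size.
    trace = sum(gram[i][i] for i in range(k))
    if trace == 0:
        return [0.0] * k
    step = 1.0 / trace
    weights = [0.0] * k
    for _ in range(iterations):
        gradient = [sum(gram[i][j] * weights[j] for j in range(k)) - rhs[i]
                    for i in range(k)]
        # Gradient step, then projection onto the non-negative orthant.
        weights = [max(0.0, w - step * g) for w, g in zip(weights, gradient)]
    return weights


def weighted_background(controls, weights):
    """Combine the controls into one background track using the fitted weights."""
    n = len(controls[0])
    return [sum(w * c[t] for w, c in zip(weights, controls)) for t in range(n)]


# Toy data: the ChIP track is built as 2 * control_a + 0.5 * control_b.
control_a = [1, 0, 2, 1, 0, 3]
control_b = [0, 1, 1, 0, 2, 1]
chip_counts = [2.0, 0.5, 4.5, 2.0, 1.0, 6.5]

weights = estimate_control_weights([control_a, control_b], chip_counts)
background = weighted_background([control_a, control_b], weights)
print([round(w, 3) for w in weights])
```

Because the toy ChIP track is an exact non-negative combination of the two controls, the fit should recover weights close to 2 and 0.5. In the workflow the abstract describes, the weighted background would then be passed to the peak calling step in place of a single pooled control.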
format Online
Article
Text
id pubmed-7885521
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-7885521 2021-02-17 WACS: improving ChIP-seq peak calling by optimally weighting controls Awdeh, Aseel Turcotte, Marcel Perkins, Theodore J. BMC Bioinformatics Research Article BioMed Central 2021-02-15 /pmc/articles/PMC7885521/ /pubmed/33588754 http://dx.doi.org/10.1186/s12859-020-03927-2 Text en © The Author(s) 2021 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
title WACS: improving ChIP-seq peak calling by optimally weighting controls
topic Research Article