Comparative analysis of ChIP-exo peak-callers: impact of data quality, read duplication and binding subtypes

BACKGROUND: ChIP (Chromatin immunoprecipitation)-exo has emerged as an important and versatile improvement over conventional ChIP-seq as it reduces the level of noise, maps the transcription factor (TF) binding location in a very precise manner, up to single base-pair resolution, and enables binding...

Bibliographic Details
Main authors: Sharma, Vasudha; Majumdar, Sharmistha
Format: Online Article Text
Language: English
Published: BioMed Central 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7035708/
https://www.ncbi.nlm.nih.gov/pubmed/32085702
http://dx.doi.org/10.1186/s12859-020-3403-3
_version_ 1783500110183792640
author Sharma, Vasudha
Majumdar, Sharmistha
collection PubMed
description BACKGROUND: ChIP (Chromatin immunoprecipitation)-exo has emerged as an important and versatile improvement over conventional ChIP-seq as it reduces the level of noise, maps the transcription factor (TF) binding location in a very precise manner, up to single base-pair resolution, and enables binding mode prediction. Availability of numerous peak-callers for analyzing ChIP-exo reads has motivated the need to assess their performance and report which tool executes reasonably well for the task. RESULTS: This study has focussed on comparing peak-callers that report direct binding events with those that report indirect binding events. The effect of strandedness of reads and duplication of data on the performance of peak-callers has been investigated. The number of peaks reported by each peak-caller is compared, followed by a comparison of the annotated motifs present in the reported peaks. The significance of peaks is assessed based on the presence of a motif in top peaks. Indirect binding tools have been compared on the basis of their ability to identify annotated motifs and predict the mode of protein-DNA interaction. CONCLUSION: By studying the output of the peak-callers investigated in this study, it is concluded that the tools that use self-learning algorithms, i.e. the tools that estimate all the essential parameters from the aligned reads, perform better than the algorithms which require formation of peak-pairs. The latest tools that account for indirect binding of TFs appear to be an upgrade over the available tools, as they are able to reveal valuable information about the mode of binding in addition to direct binding. Furthermore, the quality of ChIP-exo reads has important consequences for the output of data analysis.
format Online
Article
Text
id pubmed-7035708
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-7035708 2020-03-02 Comparative analysis of ChIP-exo peak-callers: impact of data quality, read duplication and binding subtypes Sharma, Vasudha Majumdar, Sharmistha BMC Bioinformatics Research Article
BioMed Central 2020-02-21 /pmc/articles/PMC7035708/ /pubmed/32085702 http://dx.doi.org/10.1186/s12859-020-3403-3 Text en © The Author(s). 2020 Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Comparative analysis of ChIP-exo peak-callers: impact of data quality, read duplication and binding subtypes
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7035708/
https://www.ncbi.nlm.nih.gov/pubmed/32085702
http://dx.doi.org/10.1186/s12859-020-3403-3