The CUT&RUN suspect list of problematic regions of the genome

BACKGROUND: Cleavage Under Targets and Release Using Nuclease (CUT&RUN) is an increasingly popular technique to map genome-wide binding profiles of histone modifications, transcription factors, and co-factors. The ENCODE project and others have compiled blacklists for ChIP-seq which have been wi...

Bibliographic Details
Main Authors: Nordin, Anna; Zambanini, Gianluca; Pagella, Pierfrancesco; Cantù, Claudio
Format: Online Article Text
Language: English
Published: BioMed Central 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10416431/
https://www.ncbi.nlm.nih.gov/pubmed/37563719
http://dx.doi.org/10.1186/s13059-023-03027-3
_version_ 1785087774023483392
author Nordin, Anna
Zambanini, Gianluca
Pagella, Pierfrancesco
Cantù, Claudio
author_sort Nordin, Anna
collection PubMed
description BACKGROUND: Cleavage Under Targets and Release Using Nuclease (CUT&RUN) is an increasingly popular technique to map genome-wide binding profiles of histone modifications, transcription factors, and co-factors. The ENCODE project and others have compiled blacklists for ChIP-seq which have been widely adopted: these lists contain regions of high and unstructured signal, regardless of cell type or protein target, indicating that these are false positives. While CUT&RUN obtains similar results to ChIP-seq, its biochemistry and subsequent data analyses are different. We found that this results in a CUT&RUN-specific set of undesired high-signal regions. RESULTS: We compile suspect lists based on CUT&RUN data for the human and mouse genomes, identifying regions consistently called as peaks in negative controls. Using published CUT&RUN data from our and other labs, we show that the CUT&RUN suspect regions can persist even when peak calling is performed with SEACR or MACS2 against a negative control and after ENCODE blacklist removal. Moreover, we experimentally validate the CUT&RUN suspect lists by performing reiterative negative control experiments in which no specific protein is targeted, showing that they capture more than 80% of the peaks identified. CONCLUSIONS: We propose that removing these problematic regions can substantially improve peak calling in CUT&RUN experiments, resulting in more reliable datasets. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s13059-023-03027-3.
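The filtering step proposed in the abstract, removing called peaks that fall in suspect regions, can be sketched as follows. This is a minimal illustration and not the authors' pipeline: it assumes peaks and suspect regions are available as BED-style lines (chromosome, start, end; 0-based, half-open), and the function names `parse_bed` and `remove_suspect_peaks` as well as all coordinates in the usage example are hypothetical.

```python
from collections import defaultdict


def parse_bed(lines):
    """Parse BED-style lines into (chrom, start, end) tuples.

    Blank lines, comments, and track/browser header lines are skipped.
    Only the first three columns are used.
    """
    intervals = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = line.split()[:3]
        intervals.append((chrom, int(start), int(end)))
    return intervals


def remove_suspect_peaks(peaks, suspect_regions):
    """Return the peaks that do not overlap any suspect region.

    Intervals are treated as half-open [start, end), so a peak that
    merely touches a suspect region at its boundary is kept.
    """
    # Group suspect regions by chromosome so each peak is only
    # compared against regions on its own chromosome.
    by_chrom = defaultdict(list)
    for chrom, start, end in suspect_regions:
        by_chrom[chrom].append((start, end))

    kept = []
    for chrom, start, end in peaks:
        overlaps = any(
            start < s_end and s_start < end
            for s_start, s_end in by_chrom.get(chrom, ())
        )
        if not overlaps:
            kept.append((chrom, start, end))
    return kept


# Hypothetical coordinates, for illustration only.
example_peaks = parse_bed(["chr1\t100\t200", "chr1\t500\t600", "chr2\t100\t200"])
example_suspect = parse_bed(["chr1\t150\t300"])
filtered = remove_suspect_peaks(example_peaks, example_suspect)
```

The linear scan per chromosome is adequate for short lists; for genome-scale peak sets, a sorted or tree-based interval lookup, or an existing tool such as `bedtools intersect -v`, would be the more usual choice.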
format Online
Article
Text
id pubmed-10416431
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-10416431 2023-08-12 The CUT&RUN suspect list of problematic regions of the genome Nordin, Anna; Zambanini, Gianluca; Pagella, Pierfrancesco; Cantù, Claudio Genome Biol Research
BioMed Central 2023-08-10 /pmc/articles/PMC10416431/ /pubmed/37563719 http://dx.doi.org/10.1186/s13059-023-03027-3 Text en © The Author(s) 2023 https://creativecommons.org/licenses/by/4.0/ Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
title The CUT&RUN suspect list of problematic regions of the genome
title_sort cut&run suspect list of problematic regions of the genome
topic Research