
T3E: a tool for characterising the epigenetic profile of transposable elements using ChIP-seq data

BACKGROUND: Despite the advent of Chromatin Immunoprecipitation Sequencing (ChIP-seq) having revolutionised our understanding of the mammalian genome’s regulatory landscape, many challenges remain. In particular, because of their repetitive nature, the sequencing reads derived from transposable elem...


Bibliographic Details
Main Authors: Almeida da Paz, Michelle; Taher, Leila
Format: Online Article Text
Language: English
Published: BioMed Central 2022
Subjects: Methodology
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9710123/
https://www.ncbi.nlm.nih.gov/pubmed/36451223
http://dx.doi.org/10.1186/s13100-022-00285-z
_version_ 1784841302042476544
author Almeida da Paz, Michelle
Taher, Leila
collection PubMed
description BACKGROUND: Despite the advent of Chromatin Immunoprecipitation Sequencing (ChIP-seq) having revolutionised our understanding of the mammalian genome’s regulatory landscape, many challenges remain. In particular, because of their repetitive nature, the sequencing reads derived from transposable elements (TEs) pose a real bioinformatics challenge, to the point that standard analysis pipelines typically ignore reads whose genomic origin cannot be unambiguously ascertained.
RESULTS: We show that discarding ambiguously mapping reads may lead to a systematic underestimation of the number of reads associated with young TE families/subfamilies. We also provide evidence suggesting that the strategy of randomly permuting the location of the read mappings (or the TEs) that is often used to compute the background for enrichment calculations at TE families/subfamilies can result in both false positive and negative enrichments. To address these problems, we present the Transposable Element Enrichment Estimator (T3E), a tool that makes use of ChIP-seq data to characterise the epigenetic profile of associated TE families/subfamilies. T3E weights the number of read mappings assigned to the individual TE copies of a family/subfamily by the overall number of genomic loci to which the corresponding reads map, and this is done at the single nucleotide level. In addition, T3E computes ChIP-seq enrichment relative to a background estimated based on the distribution of the read mappings in the input control DNA. We demonstrated the capabilities of T3E on 23 different ChIP-seq libraries. T3E identified enrichments that were consistent with previous studies. Furthermore, T3E detected context-specific enrichments that are likely to pinpoint unexplored TE families/subfamilies with individual TE copies that have been frequently exapted as cis-regulatory elements during the evolution of mammalian regulatory networks.
CONCLUSIONS: T3E is a novel open-source computational tool (available for use at: https://github.com/michelleapaz/T3E) that overcomes some of the pitfalls associated with the analysis of ChIP-seq data arising from the repetitive mammalian genome and provides a framework to shed light on the epigenetics of entire TE families/subfamilies.
SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s13100-022-00285-z.
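The weighting idea in the abstract can be illustrated with a small Python sketch. This is not T3E's code: the input layout, the function names (`weighted_te_signal`, `log2_enrichment`), the toy coordinates and the use of the family name `L1Md_A` are all assumptions made for illustration. In particular, "single nucleotide level" is interpreted here as crediting each mapping in proportion to the nucleotides that overlap a TE copy, and the enrichment function is a plain library-size-scaled log2 ratio that merely stands in for the input-control background model the abstract mentions.

```python
from collections import defaultdict
from math import log2


def weighted_te_signal(mappings, te_copies):
    """Sum fractional read weights per TE family.

    mappings:  (read_id, chrom, start, end) tuples, half-open coordinates,
               one tuple per locus a read maps to (hypothetical layout).
    te_copies: (chrom, start, end, family) tuples, half-open coordinates.
    """
    mappings = list(mappings)
    te_copies = list(te_copies)

    # Number of genomic loci each read maps to.
    n_loci = defaultdict(int)
    for read_id, _chrom, _start, _end in mappings:
        n_loci[read_id] += 1

    signal = defaultdict(float)
    for read_id, chrom, start, end in mappings:
        # A read that maps to n loci contributes 1/n at each locus.
        weight = 1.0 / n_loci[read_id]
        # Brute-force scan; fine for a toy example, too slow for real data.
        for te_chrom, te_start, te_end, family in te_copies:
            if te_chrom != chrom:
                continue
            # Nucleotides of this mapping that fall inside the TE copy.
            overlap = min(end, te_end) - max(start, te_start)
            if overlap > 0:
                signal[family] += weight * overlap / (end - start)
    return dict(signal)


def log2_enrichment(chip, control, chip_size, control_size, pseudocount=1.0):
    """Library-size-scaled log2 ratio of ChIP to input-control signal.

    A simplified stand-in, not the background estimation used by T3E.
    """
    scaled_chip = chip * control_size / chip_size
    return log2((scaled_chip + pseudocount) / (control + pseudocount))


# Toy data: read r1 maps to two loci, read r2 maps uniquely.
mappings = [
    ("r1", "chr1", 100, 150),
    ("r1", "chr2", 500, 550),
    ("r2", "chr1", 120, 170),
]
te_copies = [
    ("chr1", 90, 160, "L1Md_A"),
    ("chr2", 490, 560, "L1Md_A"),
]

signal = weighted_te_signal(mappings, te_copies)
print(signal)
```

In this toy case the multi-mapping read r1 contributes half of its weight at each of its two loci, so the family keeps that signal; a pipeline that discards r1 would count only the partial overlap of r2, which is the kind of underestimation the abstract describes for young TE families/subfamilies.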
format Online
Article
Text
id pubmed-9710123
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-9710123 2022-12-01 T3E: a tool for characterising the epigenetic profile of transposable elements using ChIP-seq data Almeida da Paz, Michelle Taher, Leila Mob DNA Methodology BioMed Central 2022-11-30 /pmc/articles/PMC9710123/ /pubmed/36451223 http://dx.doi.org/10.1186/s13100-022-00285-z Text en © The Author(s) 2022 https://creativecommons.org/licenses/by/4.0/
Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
spellingShingle Methodology
Almeida da Paz, Michelle
Taher, Leila
T3E: a tool for characterising the epigenetic profile of transposable elements using ChIP-seq data
title T3E: a tool for characterising the epigenetic profile of transposable elements using ChIP-seq data
topic Methodology
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9710123/
https://www.ncbi.nlm.nih.gov/pubmed/36451223
http://dx.doi.org/10.1186/s13100-022-00285-z