
Correction of transposase sequence bias in ATAC-seq data with rule ensemble modeling

Chromatin accessibility assays have revolutionized the field of transcription regulation by providing single-nucleotide resolution measurements of regulatory features such as promoters and transcription factor binding sites. ATAC-seq directly measures how well the Tn5 transposase accesses chromatini...


Bibliographic Details
Main authors: Wolpe, Jacob B; Martins, André L; Guertin, Michael J
Format: Online Article Text
Language: English
Published: Oxford University Press 2023
Subjects: High Throughput Sequencing Methods
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10236359/
https://www.ncbi.nlm.nih.gov/pubmed/37274120
http://dx.doi.org/10.1093/nargab/lqad054
author Wolpe, Jacob B
Martins, André L
Guertin, Michael J
collection PubMed
description Chromatin accessibility assays have revolutionized the field of transcription regulation by providing single-nucleotide resolution measurements of regulatory features such as promoters and transcription factor binding sites. ATAC-seq directly measures how well the Tn5 transposase accesses chromatinized DNA. Tn5 has a complex sequence bias that is not effectively scaled with traditional bias-correction methods. We model this complex bias using a rule ensemble machine learning approach that integrates information from many input k-mers proximal to the ATAC sequence reads. We effectively characterize and correct single-nucleotide sequence biases and regional sequence biases of the Tn5 enzyme. Correction of enzymatic sequence bias is an important step in interpreting chromatin accessibility assays that aim to infer transcription factor binding and regulatory activity of elements in the genome.
format Online
Article
Text
id pubmed-10236359
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-10236359 2023-06-03 Correction of transposase sequence bias in ATAC-seq data with rule ensemble modeling Wolpe, Jacob B Martins, André L Guertin, Michael J NAR Genom Bioinform High Throughput Sequencing Methods Chromatin accessibility assays have revolutionized the field of transcription regulation by providing single-nucleotide resolution measurements of regulatory features such as promoters and transcription factor binding sites. ATAC-seq directly measures how well the Tn5 transposase accesses chromatinized DNA. Tn5 has a complex sequence bias that is not effectively scaled with traditional bias-correction methods. We model this complex bias using a rule ensemble machine learning approach that integrates information from many input k-mers proximal to the ATAC sequence reads. We effectively characterize and correct single-nucleotide sequence biases and regional sequence biases of the Tn5 enzyme. Correction of enzymatic sequence bias is an important step in interpreting chromatin accessibility assays that aim to infer transcription factor binding and regulatory activity of elements in the genome. Oxford University Press 2023-06-02 /pmc/articles/PMC10236359/ /pubmed/37274120 http://dx.doi.org/10.1093/nargab/lqad054 Text en © The Author(s) 2023. Published by Oxford University Press on behalf of NAR Genomics and Bioinformatics. https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title Correction of transposase sequence bias in ATAC-seq data with rule ensemble modeling
topic High Throughput Sequencing Methods
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10236359/
https://www.ncbi.nlm.nih.gov/pubmed/37274120
http://dx.doi.org/10.1093/nargab/lqad054