Correction of transposase sequence bias in ATAC-seq data with rule ensemble modeling
Chromatin accessibility assays have revolutionized the field of transcription regulation by providing single-nucleotide resolution measurements of regulatory features such as promoters and transcription factor binding sites. ATAC-seq directly measures how well the Tn5 transposase accesses chromatini...
Main authors: | Wolpe, Jacob B; Martins, André L; Guertin, Michael J |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2023 |
Subjects: | High Throughput Sequencing Methods |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10236359/ https://www.ncbi.nlm.nih.gov/pubmed/37274120 http://dx.doi.org/10.1093/nargab/lqad054 |
_version_ | 1785052901271404544 |
---|---|
author | Wolpe, Jacob B; Martins, André L; Guertin, Michael J |
author_facet | Wolpe, Jacob B; Martins, André L; Guertin, Michael J |
author_sort | Wolpe, Jacob B |
collection | PubMed |
description | Chromatin accessibility assays have revolutionized the field of transcription regulation by providing single-nucleotide resolution measurements of regulatory features such as promoters and transcription factor binding sites. ATAC-seq directly measures how well the Tn5 transposase accesses chromatinized DNA. Tn5 has a complex sequence bias that is not effectively scaled with traditional bias-correction methods. We model this complex bias using a rule ensemble machine learning approach that integrates information from many input k-mers proximal to the ATAC sequence reads. We effectively characterize and correct single-nucleotide sequence biases and regional sequence biases of the Tn5 enzyme. Correction of enzymatic sequence bias is an important step in interpreting chromatin accessibility assays that aim to infer transcription factor binding and regulatory activity of elements in the genome. |
format | Online Article Text |
id | pubmed-10236359 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-10236359 2023-06-03 Correction of transposase sequence bias in ATAC-seq data with rule ensemble modeling Wolpe, Jacob B Martins, André L Guertin, Michael J NAR Genom Bioinform High Throughput Sequencing Methods Chromatin accessibility assays have revolutionized the field of transcription regulation by providing single-nucleotide resolution measurements of regulatory features such as promoters and transcription factor binding sites. ATAC-seq directly measures how well the Tn5 transposase accesses chromatinized DNA. Tn5 has a complex sequence bias that is not effectively scaled with traditional bias-correction methods. We model this complex bias using a rule ensemble machine learning approach that integrates information from many input k-mers proximal to the ATAC sequence reads. We effectively characterize and correct single-nucleotide sequence biases and regional sequence biases of the Tn5 enzyme. Correction of enzymatic sequence bias is an important step in interpreting chromatin accessibility assays that aim to infer transcription factor binding and regulatory activity of elements in the genome. Oxford University Press 2023-06-02 /pmc/articles/PMC10236359/ /pubmed/37274120 http://dx.doi.org/10.1093/nargab/lqad054 Text en © The Author(s) 2023. Published by Oxford University Press on behalf of NAR Genomics and Bioinformatics. https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | High Throughput Sequencing Methods Wolpe, Jacob B Martins, André L Guertin, Michael J Correction of transposase sequence bias in ATAC-seq data with rule ensemble modeling |
title | Correction of transposase sequence bias in ATAC-seq data with rule ensemble modeling |
title_full | Correction of transposase sequence bias in ATAC-seq data with rule ensemble modeling |
title_fullStr | Correction of transposase sequence bias in ATAC-seq data with rule ensemble modeling |
title_full_unstemmed | Correction of transposase sequence bias in ATAC-seq data with rule ensemble modeling |
title_short | Correction of transposase sequence bias in ATAC-seq data with rule ensemble modeling |
title_sort | correction of transposase sequence bias in atac-seq data with rule ensemble modeling |
topic | High Throughput Sequencing Methods |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10236359/ https://www.ncbi.nlm.nih.gov/pubmed/37274120 http://dx.doi.org/10.1093/nargab/lqad054 |
work_keys_str_mv | AT wolpejacobb correctionoftransposasesequencebiasinatacseqdatawithruleensemblemodeling AT martinsandrel correctionoftransposasesequencebiasinatacseqdatawithruleensemblemodeling AT guertinmichaelj correctionoftransposasesequencebiasinatacseqdatawithruleensemblemodeling |