A Unified Deep Learning Framework for Single-Cell ATAC-Seq Analysis Based on ProdDep Transformer Encoder
Recent advances in single-cell sequencing assays for the transposase-accessibility chromatin (scATAC-seq) technique have provided cell-specific chromatin accessibility landscapes of cis-regulatory elements, providing deeper insights into cellular states and dynamics. However, few research efforts ha...
Main Authors: | Wang, Zixuan; Zhang, Yongqing; Yu, Yun; Zhang, Junming; Liu, Yuhang; Zou, Quan |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | MDPI 2023 |
Subjects: | Article |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10003007/ https://www.ncbi.nlm.nih.gov/pubmed/36902216 http://dx.doi.org/10.3390/ijms24054784 |
_version_ | 1784904507158691840 |
---|---|
author | Wang, Zixuan; Zhang, Yongqing; Yu, Yun; Zhang, Junming; Liu, Yuhang; Zou, Quan |
author_facet | Wang, Zixuan; Zhang, Yongqing; Yu, Yun; Zhang, Junming; Liu, Yuhang; Zou, Quan |
author_sort | Wang, Zixuan |
collection | PubMed |
description | Recent advances in the single-cell assay for transposase-accessible chromatin using sequencing (scATAC-seq) have provided cell-specific chromatin accessibility landscapes of cis-regulatory elements, offering deeper insights into cellular states and dynamics. However, few research efforts have been dedicated to modeling the relationship between regulatory grammars and single-cell chromatin accessibility, or to incorporating the different analysis scenarios of scATAC-seq data into a general framework. To this end, we propose a unified deep learning framework based on the ProdDep Transformer Encoder, dubbed PROTRAIT, for scATAC-seq data analysis. Motivated by deep language models, PROTRAIT leverages the ProdDep Transformer Encoder to capture the syntax of transcription factor (TF)-DNA binding motifs from scATAC-seq peaks in order to predict single-cell chromatin accessibility and learn single-cell embeddings. Based on the cell embeddings, PROTRAIT annotates cell types using the Louvain algorithm. Furthermore, PROTRAIT identifies likely noise in the raw scATAC-seq data and denoises those values based on the predicted chromatin accessibility. In addition, PROTRAIT employs differential accessibility analysis to infer TF activity at single-cell and single-nucleotide resolution. Extensive experiments on the Buenrostro2018 dataset validate the effectiveness of PROTRAIT for chromatin accessibility prediction, cell type annotation, and scATAC-seq data denoising, where it outperforms current approaches across different evaluation metrics. We also confirm that the inferred TF activity is consistent with the literature, and we demonstrate the scalability of PROTRAIT to datasets containing over one million cells. |
format | Online Article Text |
id | pubmed-10003007 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-10003007 2023-03-11 A Unified Deep Learning Framework for Single-Cell ATAC-Seq Analysis Based on ProdDep Transformer Encoder Wang, Zixuan; Zhang, Yongqing; Yu, Yun; Zhang, Junming; Liu, Yuhang; Zou, Quan Int J Mol Sci Article Recent advances in the single-cell assay for transposase-accessible chromatin using sequencing (scATAC-seq) have provided cell-specific chromatin accessibility landscapes of cis-regulatory elements, offering deeper insights into cellular states and dynamics. However, few research efforts have been dedicated to modeling the relationship between regulatory grammars and single-cell chromatin accessibility, or to incorporating the different analysis scenarios of scATAC-seq data into a general framework. To this end, we propose a unified deep learning framework based on the ProdDep Transformer Encoder, dubbed PROTRAIT, for scATAC-seq data analysis. Motivated by deep language models, PROTRAIT leverages the ProdDep Transformer Encoder to capture the syntax of transcription factor (TF)-DNA binding motifs from scATAC-seq peaks in order to predict single-cell chromatin accessibility and learn single-cell embeddings. Based on the cell embeddings, PROTRAIT annotates cell types using the Louvain algorithm. Furthermore, PROTRAIT identifies likely noise in the raw scATAC-seq data and denoises those values based on the predicted chromatin accessibility. In addition, PROTRAIT employs differential accessibility analysis to infer TF activity at single-cell and single-nucleotide resolution. Extensive experiments on the Buenrostro2018 dataset validate the effectiveness of PROTRAIT for chromatin accessibility prediction, cell type annotation, and scATAC-seq data denoising, where it outperforms current approaches across different evaluation metrics. We also confirm that the inferred TF activity is consistent with the literature, and we demonstrate the scalability of PROTRAIT to datasets containing over one million cells. MDPI 2023-03-01 /pmc/articles/PMC10003007/ /pubmed/36902216 http://dx.doi.org/10.3390/ijms24054784 Text en © 2023 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). |
spellingShingle | Article Wang, Zixuan Zhang, Yongqing Yu, Yun Zhang, Junming Liu, Yuhang Zou, Quan A Unified Deep Learning Framework for Single-Cell ATAC-Seq Analysis Based on ProdDep Transformer Encoder |
title | A Unified Deep Learning Framework for Single-Cell ATAC-Seq Analysis Based on ProdDep Transformer Encoder |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10003007/ https://www.ncbi.nlm.nih.gov/pubmed/36902216 http://dx.doi.org/10.3390/ijms24054784 |
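
The description above says that PROTRAIT denoises likely-noisy entries of the raw scATAC-seq data using the predicted chromatin accessibility. The record does not give the rule, so the following is only a minimal sketch of one plausible reading, not the authors' implementation: a binary cell-by-peak matrix is compared with predicted accessibility probabilities, and an entry is flipped only when the prediction strongly disagrees with the observation. The function name `denoise` and the thresholds `low` and `high` are hypothetical.

```python
from typing import List


def denoise(
    raw: List[List[int]],
    predicted: List[List[float]],
    low: float = 0.1,   # hypothetical threshold: below this, an observed 1 is treated as noise
    high: float = 0.9,  # hypothetical threshold: above this, an observed 0 is treated as dropout
) -> List[List[int]]:
    """Return a denoised copy of a binary cell-by-peak matrix.

    Assumption (not taken from the paper): an entry is considered noisy
    only when the predicted accessibility strongly contradicts it.
    """
    cleaned = []
    for raw_row, pred_row in zip(raw, predicted):
        new_row = []
        for observed, prob in zip(raw_row, pred_row):
            if observed == 0 and prob >= high:
                new_row.append(1)  # likely dropout: peak predicted accessible
            elif observed == 1 and prob <= low:
                new_row.append(0)  # likely false positive: peak predicted closed
            else:
                new_row.append(observed)  # prediction is not decisive; keep the observation
        cleaned.append(new_row)
    return cleaned


# Toy data: 2 cells x 3 peaks (values are made up for illustration).
raw = [[0, 1, 0], [1, 0, 1]]
predicted = [[0.95, 0.80, 0.20], [0.05, 0.50, 0.99]]
cleaned = denoise(raw, predicted)
print(cleaned)
```

Keeping the observed value whenever the prediction falls between the two thresholds is a conservative design choice: only entries the model is confident about are changed, and the input matrix itself is left unmodified.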