
A Unified Deep Learning Framework for Single-Cell ATAC-Seq Analysis Based on ProdDep Transformer Encoder

Recent advances in the single-cell sequencing assay for transposase-accessible chromatin (scATAC-seq) have provided cell-specific chromatin accessibility landscapes of cis-regulatory elements, offering deeper insights into cellular states and dynamics. However, few research efforts ha...


Bibliographic Details
Main Authors: Wang, Zixuan, Zhang, Yongqing, Yu, Yun, Zhang, Junming, Liu, Yuhang, Zou, Quan
Format: Online Article Text
Language: English
Published: MDPI 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10003007/
https://www.ncbi.nlm.nih.gov/pubmed/36902216
http://dx.doi.org/10.3390/ijms24054784
_version_ 1784904507158691840
author Wang, Zixuan
Zhang, Yongqing
Yu, Yun
Zhang, Junming
Liu, Yuhang
Zou, Quan
author_facet Wang, Zixuan
Zhang, Yongqing
Yu, Yun
Zhang, Junming
Liu, Yuhang
Zou, Quan
author_sort Wang, Zixuan
collection PubMed
description Recent advances in the single-cell sequencing assay for transposase-accessible chromatin (scATAC-seq) have provided cell-specific chromatin accessibility landscapes of cis-regulatory elements, offering deeper insights into cellular states and dynamics. However, few research efforts have been dedicated to modeling the relationship between regulatory grammars and single-cell chromatin accessibility, or to incorporating the different analysis scenarios of scATAC-seq data into a general framework. To this end, we propose a unified deep learning framework based on the ProdDep Transformer Encoder, dubbed PROTRAIT, for scATAC-seq data analysis. Motivated by deep language models, PROTRAIT leverages the ProdDep Transformer Encoder to capture the syntax of transcription factor (TF)-DNA binding motifs from scATAC-seq peaks, both to predict single-cell chromatin accessibility and to learn single-cell embeddings. Based on the cell embeddings, PROTRAIT annotates cell types using the Louvain algorithm. Furthermore, after identifying likely noise in the raw scATAC-seq data, PROTRAIT denoises these values using the predicted chromatin accessibility. In addition, PROTRAIT employs differential accessibility analysis to infer TF activity at single-cell and single-nucleotide resolution. Extensive experiments on the Buenrostro2018 dataset validate the effectiveness of PROTRAIT for chromatin accessibility prediction, cell type annotation, and scATAC-seq data denoising, outperforming current approaches across different evaluation metrics. We also confirm that the inferred TF activity is consistent with the literature. Finally, we demonstrate the scalability of PROTRAIT to datasets containing over one million cells.
format Online
Article
Text
id pubmed-10003007
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-10003007 2023-03-11 A Unified Deep Learning Framework for Single-Cell ATAC-Seq Analysis Based on ProdDep Transformer Encoder Wang, Zixuan Zhang, Yongqing Yu, Yun Zhang, Junming Liu, Yuhang Zou, Quan Int J Mol Sci Article Recent advances in the single-cell sequencing assay for transposase-accessible chromatin (scATAC-seq) have provided cell-specific chromatin accessibility landscapes of cis-regulatory elements, offering deeper insights into cellular states and dynamics. However, few research efforts have been dedicated to modeling the relationship between regulatory grammars and single-cell chromatin accessibility, or to incorporating the different analysis scenarios of scATAC-seq data into a general framework. To this end, we propose a unified deep learning framework based on the ProdDep Transformer Encoder, dubbed PROTRAIT, for scATAC-seq data analysis. Motivated by deep language models, PROTRAIT leverages the ProdDep Transformer Encoder to capture the syntax of transcription factor (TF)-DNA binding motifs from scATAC-seq peaks, both to predict single-cell chromatin accessibility and to learn single-cell embeddings. Based on the cell embeddings, PROTRAIT annotates cell types using the Louvain algorithm. Furthermore, after identifying likely noise in the raw scATAC-seq data, PROTRAIT denoises these values using the predicted chromatin accessibility. In addition, PROTRAIT employs differential accessibility analysis to infer TF activity at single-cell and single-nucleotide resolution. Extensive experiments on the Buenrostro2018 dataset validate the effectiveness of PROTRAIT for chromatin accessibility prediction, cell type annotation, and scATAC-seq data denoising, outperforming current approaches across different evaluation metrics. We also confirm that the inferred TF activity is consistent with the literature. Finally, we demonstrate the scalability of PROTRAIT to datasets containing over one million cells. MDPI 2023-03-01 /pmc/articles/PMC10003007/ /pubmed/36902216 http://dx.doi.org/10.3390/ijms24054784 Text en © 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
spellingShingle Article
Wang, Zixuan
Zhang, Yongqing
Yu, Yun
Zhang, Junming
Liu, Yuhang
Zou, Quan
A Unified Deep Learning Framework for Single-Cell ATAC-Seq Analysis Based on ProdDep Transformer Encoder
title A Unified Deep Learning Framework for Single-Cell ATAC-Seq Analysis Based on ProdDep Transformer Encoder
title_full A Unified Deep Learning Framework for Single-Cell ATAC-Seq Analysis Based on ProdDep Transformer Encoder
title_fullStr A Unified Deep Learning Framework for Single-Cell ATAC-Seq Analysis Based on ProdDep Transformer Encoder
title_full_unstemmed A Unified Deep Learning Framework for Single-Cell ATAC-Seq Analysis Based on ProdDep Transformer Encoder
title_short A Unified Deep Learning Framework for Single-Cell ATAC-Seq Analysis Based on ProdDep Transformer Encoder
title_sort unified deep learning framework for single-cell atac-seq analysis based on proddep transformer encoder
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10003007/
https://www.ncbi.nlm.nih.gov/pubmed/36902216
http://dx.doi.org/10.3390/ijms24054784