Learning single-cell chromatin accessibility profiles using meta-analytic marker genes

MOTIVATION: Single-cell assay for transposase accessible chromatin using sequencing (scATAC-seq) is a valuable resource to learn cis-regulatory elements such as cell-type specific enhancers and transcription factor binding sites. However, cell-type identification of scATAC-seq data is known to be ch...

Bibliographic Details
Main authors: Kawaguchi, Risa Karakida; Tang, Ziqi; Fischer, Stephan; Rajesh, Chandana; Tripathy, Rohit; Koo, Peter K; Gillis, Jesse
Format: Online Article Text
Language: English
Published: Oxford University Press 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9851328/
https://www.ncbi.nlm.nih.gov/pubmed/36549922
http://dx.doi.org/10.1093/bib/bbac541
_version_ 1784872372105379840
author Kawaguchi, Risa Karakida
Tang, Ziqi
Fischer, Stephan
Rajesh, Chandana
Tripathy, Rohit
Koo, Peter K
Gillis, Jesse
author_facet Kawaguchi, Risa Karakida
Tang, Ziqi
Fischer, Stephan
Rajesh, Chandana
Tripathy, Rohit
Koo, Peter K
Gillis, Jesse
author_sort Kawaguchi, Risa Karakida
collection PubMed
description MOTIVATION: Single-cell assay for transposase accessible chromatin using sequencing (scATAC-seq) is a valuable resource to learn cis-regulatory elements such as cell-type specific enhancers and transcription factor binding sites. However, cell-type identification of scATAC-seq data is known to be challenging due to the heterogeneity derived from different protocols and the high dropout rate. RESULTS: In this study, we perform a systematic comparison of seven scATAC-seq datasets of mouse brain to benchmark the efficacy of neuronal cell-type annotation from gene sets. We find that redundant marker genes give a dramatic improvement for a sparse scATAC-seq annotation across the data collected from different studies. Interestingly, simple aggregation of such marker genes achieves performance comparable or higher than that of machine-learning classifiers, suggesting its potential for downstream applications. Based on our results, we reannotated all scATAC-seq data for detailed cell types using robust marker genes. Their meta scATAC-seq profiles are publicly available at https://gillisweb.cshl.edu/Meta_scATAC. Furthermore, we trained a deep neural network to predict chromatin accessibility from only DNA sequence and identified key motifs enriched for each neuronal subtype. Those predicted profiles are visualized together in our database as a valuable resource to explore cell-type specific epigenetic regulation in a sequence-dependent and -independent manner.
format Online
Article
Text
id pubmed-9851328
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-98513282023-01-20 Learning single-cell chromatin accessibility profiles using meta-analytic marker genes Kawaguchi, Risa Karakida Tang, Ziqi Fischer, Stephan Rajesh, Chandana Tripathy, Rohit Koo, Peter K Gillis, Jesse Brief Bioinform Problem Solving Protocol MOTIVATION: Single-cell assay for transposase accessible chromatin using sequencing (scATAC-seq) is a valuable resource to learn cis-regulatory elements such as cell-type specific enhancers and transcription factor binding sites. However, cell-type identification of scATAC-seq data is known to be challenging due to the heterogeneity derived from different protocols and the high dropout rate. RESULTS: In this study, we perform a systematic comparison of seven scATAC-seq datasets of mouse brain to benchmark the efficacy of neuronal cell-type annotation from gene sets. We find that redundant marker genes give a dramatic improvement for a sparse scATAC-seq annotation across the data collected from different studies. Interestingly, simple aggregation of such marker genes achieves performance comparable or higher than that of machine-learning classifiers, suggesting its potential for downstream applications. Based on our results, we reannotated all scATAC-seq data for detailed cell types using robust marker genes. Their meta scATAC-seq profiles are publicly available at https://gillisweb.cshl.edu/Meta_scATAC. Furthermore, we trained a deep neural network to predict chromatin accessibility from only DNA sequence and identified key motifs enriched for each neuronal subtype. Those predicted profiles are visualized together in our database as a valuable resource to explore cell-type specific epigenetic regulation in a sequence-dependent and -independent manner. Oxford University Press 2022-12-22 /pmc/articles/PMC9851328/ /pubmed/36549922 http://dx.doi.org/10.1093/bib/bbac541 Text en © The Author(s) 2022. Published by Oxford University Press. 
https://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (https://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
spellingShingle Problem Solving Protocol
Kawaguchi, Risa Karakida
Tang, Ziqi
Fischer, Stephan
Rajesh, Chandana
Tripathy, Rohit
Koo, Peter K
Gillis, Jesse
Learning single-cell chromatin accessibility profiles using meta-analytic marker genes
title Learning single-cell chromatin accessibility profiles using meta-analytic marker genes
title_full Learning single-cell chromatin accessibility profiles using meta-analytic marker genes
title_fullStr Learning single-cell chromatin accessibility profiles using meta-analytic marker genes
title_full_unstemmed Learning single-cell chromatin accessibility profiles using meta-analytic marker genes
title_short Learning single-cell chromatin accessibility profiles using meta-analytic marker genes
title_sort learning single-cell chromatin accessibility profiles using meta-analytic marker genes
topic Problem Solving Protocol
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9851328/
https://www.ncbi.nlm.nih.gov/pubmed/36549922
http://dx.doi.org/10.1093/bib/bbac541