
SECANT: a biology-guided semi-supervised method for clustering, classification, and annotation of single-cell multi-omics

Bibliographic Details
Main authors: Wang, Xinjun, Xu, Zhongli, Hu, Haoran, Zhou, Xueping, Zhang, Yanfu, Lafyatis, Robert, Chen, Kong, Huang, Heng, Ding, Ying, Duerr, Richard H, Chen, Wei
Format: Online Article Text
Language: English
Published: Oxford University Press 2022
Subjects: Biological, Health, and Medical Sciences
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9491696/
https://www.ncbi.nlm.nih.gov/pubmed/36157595
http://dx.doi.org/10.1093/pnasnexus/pgac165
author Wang, Xinjun
Xu, Zhongli
Hu, Haoran
Zhou, Xueping
Zhang, Yanfu
Lafyatis, Robert
Chen, Kong
Huang, Heng
Ding, Ying
Duerr, Richard H
Chen, Wei
collection PubMed
description Recent advances in single-cell sequencing (scRNA-seq) technologies such as Cellular Indexing of Transcriptomes and Epitopes by Sequencing (CITE-seq) allow researchers to quantify cell surface protein abundance and RNA expression simultaneously at single-cell resolution. Although CITE-seq and similar technologies have gained enormous popularity, novel methods for analyzing this type of single-cell multi-omics data are urgently needed. A limited number of available tools use a data-driven approach, which may undermine the biological importance of surface protein data. In this study, we developed SECANT, a biology-guided SEmi-supervised method for Clustering, classification, and ANnoTation of single-cell multi-omics. SECANT can be used to analyze CITE-seq data alone or to jointly analyze CITE-seq and scRNA-seq data. The novelties of SECANT include (1) using confident cell type labels identified from surface protein data as guidance for cell clustering, (2) providing general annotation of confident cell types for each cell cluster, (3) utilizing cells with uncertain or missing cell type labels to increase performance, and (4) accurately predicting confident cell types for scRNA-seq data. In addition, as a model-based approach, SECANT can quantify the uncertainty of the results through easily interpretable posterior probabilities, and the framework can potentially be extended to handle other types of multi-omics data. We demonstrated the validity and advantages of SECANT via simulation studies and analyses of public and in-house datasets from multiple tissues. We believe this new method will complement existing tools for characterizing novel cell types and making new biological discoveries using single-cell multi-omics data.
format Online
Article
Text
id pubmed-9491696
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-9491696 2022-09-22
PNAS Nexus
Biological, Health, and Medical Sciences
Oxford University Press 2022-08-19
/pmc/articles/PMC9491696/ /pubmed/36157595 http://dx.doi.org/10.1093/pnasnexus/pgac165
Text en
© The Author(s) 2022. Published by Oxford University Press on behalf of National Academy of Sciences. https://creativecommons.org/licenses/by-nc-nd/4.0/
This is an Open Access article distributed under the terms of the Creative Commons Attribution-NonCommercial-NoDerivs licence (https://creativecommons.org/licenses/by-nc-nd/4.0/), which permits non-commercial reproduction and distribution of the work, in any medium, provided the original work is not altered or transformed in any way, and that the work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
title SECANT: a biology-guided semi-supervised method for clustering, classification, and annotation of single-cell multi-omics
topic Biological, Health, and Medical Sciences