Identifying phenotype-associated subpopulations by integrating bulk and single-cell sequencing data

Single-cell RNA sequencing distinguishes cell types, states, and lineages within the context of heterogeneous tissues. However, current single-cell data cannot directly link cell clusters with specific phenotypes. Here we present Scissor, a method that identifies cell subpopulations from single-cell...

Bibliographic Details
Main Authors: Sun, Duanchen; Guan, Xiangnan; Moran, Amy E.; Wu, Ling-Yun; Qian, David Z.; Schedin, Pepper; Dai, Mu-Shui; Danilov, Alexey V.; Alumkal, Joshi J.; Adey, Andrew C.; Spellman, Paul T.; Xia, Zheng
Format: Online Article Text
Language: English
Published: 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9010342/
https://www.ncbi.nlm.nih.gov/pubmed/34764492
http://dx.doi.org/10.1038/s41587-021-01091-3
_version_ 1784687462053838848
author Sun, Duanchen
Guan, Xiangnan
Moran, Amy E.
Wu, Ling-Yun
Qian, David Z.
Schedin, Pepper
Dai, Mu-Shui
Danilov, Alexey V.
Alumkal, Joshi J.
Adey, Andrew C.
Spellman, Paul T.
Xia, Zheng
author_sort Sun, Duanchen
collection PubMed
description Single-cell RNA sequencing distinguishes cell types, states, and lineages within the context of heterogeneous tissues. However, current single-cell data cannot directly link cell clusters with specific phenotypes. Here we present Scissor, a method that identifies cell subpopulations from single-cell data that are associated with a given phenotype. Scissor integrates phenotype-associated bulk expression data and single-cell data by first quantifying the similarity between each single cell and each bulk sample. It then optimizes a regression model on the correlation matrix with the sample phenotype to identify relevant subpopulations. Applied to a lung cancer single-cell RNA-seq dataset, Scissor identified subsets of cells associated with worse survival and with TP53 mutations. In melanoma, Scissor discerned a T cell subpopulation with low PDCD1/CTLA4 and high TCF7 expression associated with an immunotherapy response. Beyond cancer, Scissor was effective in interpreting facioscapulohumeral muscular dystrophy (FSHD) and Alzheimer’s disease datasets. Scissor identifies biologically and clinically relevant cell subpopulations from single-cell assays by leveraging phenotype and bulk-omics datasets.
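The two steps named in the description (a cell-to-sample similarity matrix, then a regression of the sample phenotype on that matrix) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `correlation_matrix`, `lasso_ista`, and `select_cells` are hypothetical, Pearson correlation is assumed as the similarity measure, and a plain L1-penalized linear regression solved by proximal gradient descent stands in for whatever regression model Scissor actually optimizes. Every cell and sample is assumed to have non-constant expression, so the standard deviations are nonzero.

```python
import numpy as np


def correlation_matrix(bulk, cells):
    """Pearson correlation between every bulk sample and every single cell.

    bulk:  (n_genes, n_samples) expression matrix
    cells: (n_genes, n_cells) expression matrix over the same genes
    Returns an (n_samples, n_cells) matrix.
    """
    # Standardize each column (population std), then average the products.
    bz = (bulk - bulk.mean(axis=0)) / bulk.std(axis=0)
    cz = (cells - cells.mean(axis=0)) / cells.std(axis=0)
    return bz.T @ cz / bulk.shape[0]


def lasso_ista(X, y, lam, n_iter=500):
    """Minimize (1/2n)||y - X b||^2 + lam * ||b||_1 by proximal gradient (ISTA)."""
    n = X.shape[0]
    # Step size = 1 / Lipschitz constant of the smooth part's gradient.
    step = n / (np.linalg.norm(X, 2) ** 2)
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        grad = X.T @ (X @ beta - y) / n
        z = beta - step * grad
        # Soft-thresholding is the proximal operator of the L1 penalty.
        beta = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)
    return beta


def select_cells(bulk, cells, phenotype, lam=0.05):
    """Return one coefficient per cell; nonzero entries mark selected cells."""
    S = correlation_matrix(bulk, cells)   # rows: samples, columns: cells
    X = S - S.mean(axis=0)                # center so no intercept is needed
    y = phenotype - phenotype.mean()
    return lasso_ista(X, y, lam)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    bulk = rng.random((200, 30))          # synthetic data, for illustration only
    cells = rng.random((200, 100))
    phenotype = rng.random(30)
    beta = select_cells(bulk, cells, phenotype)
    print("cells with nonzero coefficient:", int(np.count_nonzero(beta)))
```

In this sketch each cell receives one coefficient; the L1 penalty drives most of them to zero, and the sign of a surviving coefficient gives the direction of that cell's association with the phenotype. The penalty weight `lam` is an arbitrary illustrative value and would need tuning on real data.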
format Online
Article
Text
id pubmed-9010342
institution National Center for Biotechnology Information
language English
publishDate 2022
record_format MEDLINE/PubMed
spelling pubmed-9010342 2022-05-11 Nat Biotechnol Article 2022-04 2021-11-11 /pmc/articles/PMC9010342/ /pubmed/34764492 http://dx.doi.org/10.1038/s41587-021-01091-3 Text en Users may view, print, copy, and download text and data-mine the content in such documents, for the purposes of academic research, subject always to the full Conditions of use: https://www.springernature.com/gp/open-research/policies/accepted-manuscript-terms
title Identifying phenotype-associated subpopulations by integrating bulk and single-cell sequencing data
topic Article