Supervised learning of high-confidence phenotypic subpopulations from single-cell data

Accurately identifying phenotype-relevant cell subsets from heterogeneous cell populations is crucial for delineating the underlying mechanisms driving biological or clinical phenotypes. Here, by deploying a learning with rejection strategy, we developed a novel supervised learning framework called...

Bibliographic Details
Main authors: Ren, Tao, Chen, Canping, Danilov, Alexey V., Liu, Susan, Guan, Xiangnan, Du, Shunyi, Wu, Xiwei, Sherman, Mara H., Spellman, Paul T., Coussens, Lisa M., Adey, Andrew C., Mills, Gordon B., Wu, Ling-Yun, Xia, Zheng
Format: Online Article Text
Language: English
Published: Cold Spring Harbor Laboratory 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10055361/
https://www.ncbi.nlm.nih.gov/pubmed/36993424
http://dx.doi.org/10.1101/2023.03.23.533712
author Ren, Tao
Chen, Canping
Danilov, Alexey V.
Liu, Susan
Guan, Xiangnan
Du, Shunyi
Wu, Xiwei
Sherman, Mara H.
Spellman, Paul T.
Coussens, Lisa M.
Adey, Andrew C.
Mills, Gordon B.
Wu, Ling-Yun
Xia, Zheng
collection PubMed
description Accurately identifying phenotype-relevant cell subsets from heterogeneous cell populations is crucial for delineating the underlying mechanisms driving biological or clinical phenotypes. Here, by deploying a learning with rejection strategy, we developed a novel supervised learning framework called PENCIL to identify subpopulations associated with categorical or continuous phenotypes from single-cell data. By embedding a feature selection function into this flexible framework, for the first time, we were able to select informative features and identify cell subpopulations simultaneously, which enables the accurate identification of phenotypic subpopulations otherwise missed by methods incapable of concurrent gene selection. Furthermore, the regression mode of PENCIL presents a novel ability for supervised phenotypic trajectory learning of subpopulations from single-cell data. We conducted comprehensive simulations to evaluate PENCIL's versatility in simultaneous gene selection, subpopulation identification and phenotypic trajectory prediction. PENCIL is fast and scalable to analyze 1 million cells within 1 hour. Using the classification mode, PENCIL detected T-cell subpopulations associated with melanoma immunotherapy outcomes. Moreover, when applied to scRNA-seq of a mantle cell lymphoma patient with drug treatment across multiple time points, the regression mode of PENCIL revealed a transcriptional treatment response trajectory. Collectively, our work introduces a scalable and flexible infrastructure to accurately identify phenotype-associated subpopulations from single-cell data.
format Online
Article
Text
id pubmed-10055361
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Cold Spring Harbor Laboratory
record_format MEDLINE/PubMed
spelling pubmed-10055361 2023-03-30 bioRxiv Article
Cold Spring Harbor Laboratory 2023-03-25 /pmc/articles/PMC10055361/ /pubmed/36993424 http://dx.doi.org/10.1101/2023.03.23.533712 Text en https://creativecommons.org/licenses/by-nc-nd/4.0/ This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License (https://creativecommons.org/licenses/by-nc-nd/4.0/), which allows reusers to copy and distribute the material in any medium or format in unadapted form only, for noncommercial purposes only, and only so long as attribution is given to the creator.
title Supervised learning of high-confidence phenotypic subpopulations from single-cell data
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10055361/
https://www.ncbi.nlm.nih.gov/pubmed/36993424
http://dx.doi.org/10.1101/2023.03.23.533712