Supervised learning of high-confidence phenotypic subpopulations from single-cell data
Accurately identifying phenotype-relevant cell subsets from heterogeneous cell populations is crucial for delineating the underlying mechanisms driving biological or clinical phenotypes. Here, by deploying a learning with rejection strategy, we developed a novel supervised learning framework called...
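Main authors: | Ren, Tao; Chen, Canping; Danilov, Alexey V.; Liu, Susan; Guan, Xiangnan; Du, Shunyi; Wu, Xiwei; Sherman, Mara H.; Spellman, Paul T.; Coussens, Lisa M.; Adey, Andrew C.; Mills, Gordon B.; Wu, Ling-Yun; Xia, Zheng |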
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Cold Spring Harbor Laboratory, 2023 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10055361/ https://www.ncbi.nlm.nih.gov/pubmed/36993424 http://dx.doi.org/10.1101/2023.03.23.533712 |
_version_ | 1785015862964518912 |
---|---|
author | Ren, Tao Chen, Canping Danilov, Alexey V. Liu, Susan Guan, Xiangnan Du, Shunyi Wu, Xiwei Sherman, Mara H. Spellman, Paul T. Coussens, Lisa M. Adey, Andrew C. Mills, Gordon B. Wu, Ling-Yun Xia, Zheng |
author_facet | Ren, Tao Chen, Canping Danilov, Alexey V. Liu, Susan Guan, Xiangnan Du, Shunyi Wu, Xiwei Sherman, Mara H. Spellman, Paul T. Coussens, Lisa M. Adey, Andrew C. Mills, Gordon B. Wu, Ling-Yun Xia, Zheng |
author_sort | Ren, Tao |
collection | PubMed |
description | Accurately identifying phenotype-relevant cell subsets from heterogeneous cell populations is crucial for delineating the underlying mechanisms driving biological or clinical phenotypes. Here, by deploying a learning with rejection strategy, we developed a novel supervised learning framework called PENCIL to identify subpopulations associated with categorical or continuous phenotypes from single-cell data. By embedding a feature selection function into this flexible framework, for the first time, we were able to select informative features and identify cell subpopulations simultaneously, which enables the accurate identification of phenotypic subpopulations otherwise missed by methods incapable of concurrent gene selection. Furthermore, the regression mode of PENCIL presents a novel ability for supervised phenotypic trajectory learning of subpopulations from single-cell data. We conducted comprehensive simulations to evaluate PENCIL's versatility in simultaneous gene selection, subpopulation identification and phenotypic trajectory prediction. PENCIL is fast and scalable to analyze 1 million cells within 1 hour. Using the classification mode, PENCIL detected T-cell subpopulations associated with melanoma immunotherapy outcomes. Moreover, when applied to scRNA-seq of a mantle cell lymphoma patient with drug treatment across multiple time points, the regression mode of PENCIL revealed a transcriptional treatment response trajectory. Collectively, our work introduces a scalable and flexible infrastructure to accurately identify phenotype-associated subpopulations from single-cell data. |
format | Online Article Text |
id | pubmed-10055361 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Cold Spring Harbor Laboratory |
record_format | MEDLINE/PubMed |
spelling | pubmed-10055361 2023-03-30 Supervised learning of high-confidence phenotypic subpopulations from single-cell data Ren, Tao Chen, Canping Danilov, Alexey V. Liu, Susan Guan, Xiangnan Du, Shunyi Wu, Xiwei Sherman, Mara H. Spellman, Paul T. Coussens, Lisa M. Adey, Andrew C. Mills, Gordon B. Wu, Ling-Yun Xia, Zheng bioRxiv Article Accurately identifying phenotype-relevant cell subsets from heterogeneous cell populations is crucial for delineating the underlying mechanisms driving biological or clinical phenotypes. Here, by deploying a learning with rejection strategy, we developed a novel supervised learning framework called PENCIL to identify subpopulations associated with categorical or continuous phenotypes from single-cell data. By embedding a feature selection function into this flexible framework, for the first time, we were able to select informative features and identify cell subpopulations simultaneously, which enables the accurate identification of phenotypic subpopulations otherwise missed by methods incapable of concurrent gene selection. Furthermore, the regression mode of PENCIL presents a novel ability for supervised phenotypic trajectory learning of subpopulations from single-cell data. We conducted comprehensive simulations to evaluate PENCIL's versatility in simultaneous gene selection, subpopulation identification and phenotypic trajectory prediction. PENCIL is fast and scalable to analyze 1 million cells within 1 hour. Using the classification mode, PENCIL detected T-cell subpopulations associated with melanoma immunotherapy outcomes. Moreover, when applied to scRNA-seq of a mantle cell lymphoma patient with drug treatment across multiple time points, the regression mode of PENCIL revealed a transcriptional treatment response trajectory. Collectively, our work introduces a scalable and flexible infrastructure to accurately identify phenotype-associated subpopulations from single-cell data. 
Cold Spring Harbor Laboratory 2023-03-25 /pmc/articles/PMC10055361/ /pubmed/36993424 http://dx.doi.org/10.1101/2023.03.23.533712 Text en https://creativecommons.org/licenses/by-nc-nd/4.0/ This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License (https://creativecommons.org/licenses/by-nc-nd/4.0/), which allows reusers to copy and distribute the material in any medium or format in unadapted form only, for noncommercial purposes only, and only so long as attribution is given to the creator. |
spellingShingle | Article Ren, Tao Chen, Canping Danilov, Alexey V. Liu, Susan Guan, Xiangnan Du, Shunyi Wu, Xiwei Sherman, Mara H. Spellman, Paul T. Coussens, Lisa M. Adey, Andrew C. Mills, Gordon B. Wu, Ling-Yun Xia, Zheng Supervised learning of high-confidence phenotypic subpopulations from single-cell data |
title | Supervised learning of high-confidence phenotypic subpopulations from single-cell data |
title_full | Supervised learning of high-confidence phenotypic subpopulations from single-cell data |
title_fullStr | Supervised learning of high-confidence phenotypic subpopulations from single-cell data |
title_full_unstemmed | Supervised learning of high-confidence phenotypic subpopulations from single-cell data |
title_short | Supervised learning of high-confidence phenotypic subpopulations from single-cell data |
title_sort | supervised learning of high-confidence phenotypic subpopulations from single-cell data |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10055361/ https://www.ncbi.nlm.nih.gov/pubmed/36993424 http://dx.doi.org/10.1101/2023.03.23.533712 |