Evaluation of single-cell RNAseq labelling algorithms using cancer datasets

Bibliographic Details
Main Authors: Christensen, Erik, Luo, Ping, Turinsky, Andrei, Husić, Mia, Mahalanabis, Alaina, Naidas, Alaine, Diaz-Mejia, Juan Javier, Brudno, Michael, Pugh, Trevor, Ramani, Arun, Shooshtari, Parisa
Format: Online Article Text
Language: English
Published: Oxford University Press 2022
Subjects: Review
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9851326/
https://www.ncbi.nlm.nih.gov/pubmed/36585784
http://dx.doi.org/10.1093/bib/bbac561
author Christensen, Erik
Luo, Ping
Turinsky, Andrei
Husić, Mia
Mahalanabis, Alaina
Naidas, Alaine
Diaz-Mejia, Juan Javier
Brudno, Michael
Pugh, Trevor
Ramani, Arun
Shooshtari, Parisa
author_sort Christensen, Erik
collection PubMed
description Single-cell RNA sequencing (scRNA-seq) clustering and labelling methods are used to determine precise cellular composition of tissue samples. Automated labelling methods rely on either unsupervised, cluster-based approaches or supervised, cell-based approaches to identify cell types. The high complexity of cancer poses a unique challenge, as tumor microenvironments are often composed of diverse cell subpopulations with unique functional effects that may lead to disease progression, metastasis and treatment resistance. Here, we assess 17 cell-based and 9 cluster-based scRNA-seq labelling algorithms using 8 cancer datasets, providing a comprehensive large-scale assessment of such methods in a cancer-specific context. Using several performance metrics, we show that cell-based methods generally achieved higher performance and were faster compared to cluster-based methods. Cluster-based methods more successfully labelled non-malignant cell types, likely because of a lack of gene signatures for relevant malignant cell subpopulations. Larger cell numbers present in some cell types in training data positively impacted prediction scores for cell-based methods. Finally, we examined which methods performed favorably when trained and tested on separate patient cohorts in scenarios similar to clinical applications, and which were able to accurately label particularly small or under-represented cell populations in the given datasets. We conclude that scPred and SVM show the best overall performances with cancer-specific data and provide further suggestions for algorithm selection. Our analysis pipeline for assessing the performance of cell type labelling algorithms is available in https://github.com/shooshtarilab/scRNAseq-Automated-Cell-Type-Labelling.
format Online
Article
Text
id pubmed-9851326
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-9851326 2023-01-20
Brief Bioinform
Review
Oxford University Press 2022-12-30
/pmc/articles/PMC9851326/
/pubmed/36585784
http://dx.doi.org/10.1093/bib/bbac561
Text en
© The Author(s) 2022. Published by Oxford University Press.
https://creativecommons.org/licenses/by/4.0/
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title Evaluation of single-cell RNAseq labelling algorithms using cancer datasets
topic Review
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9851326/
https://www.ncbi.nlm.nih.gov/pubmed/36585784
http://dx.doi.org/10.1093/bib/bbac561