Variant calling enhances the identification of cancer cells in single-cell RNA sequencing data

Bibliographic Details
Main Authors: Gasper, William; Rossi, Francesca; Ligorio, Matteo; Ghersi, Dario
Format: Online Article Text
Language: English
Published: Public Library of Science 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9560611/
https://www.ncbi.nlm.nih.gov/pubmed/36191033
http://dx.doi.org/10.1371/journal.pcbi.1010576
Description
Summary: Single-cell RNA-sequencing is an invaluable research tool that allows for the investigation of gene expression in heterogeneous cancer cell populations in ways that bulk RNA-seq cannot. However, normal (i.e., non tumor) cells in cancer samples have the potential to confound the downstream analysis of single-cell RNA-seq data. Existing methods for identifying cancer and normal cells include copy number variation inference, marker-gene expression analysis, and expression-based clustering. This work aims to extend the existing approaches for identifying cancer cells in single-cell RNA-seq samples by incorporating variant calling and the identification of putative driver alterations. We found that putative driver alterations can be detected in single-cell RNA-seq data obtained with full-length transcript technologies and noticed that a subset of cells in tumor samples are enriched for putative driver alterations as compared to normal cells. Furthermore, we show that the number of putative driver alterations and inferred copy number variation are not correlated in all samples. Taken together, our findings suggest that augmenting existing cancer-cell filtering methods with variant calling and analysis can increase the number of tumor cells that can be confidently included in downstream analyses of single-cell full-length transcript RNA-seq datasets.