Evaluation of zero counts to better understand the discrepancies between bulk and single-cell RNA-Seq platforms

Bibliographic Details
Main authors: Zyla, Joanna; Papiez, Anna; Zhao, Jun; Qu, Rihao; Li, Xiaotong; Kluger, Yuval; Polanska, Joanna; Hatzis, Christos; Pusztai, Lajos; Marczyk, Michal
Format: Online Article Text
Language: English
Published: Research Network of Computational and Structural Biotechnology 2023
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10568495/
https://www.ncbi.nlm.nih.gov/pubmed/37841335
http://dx.doi.org/10.1016/j.csbj.2023.09.035
Description
Summary: Recent advances in sample preparation and sequencing technology have made it possible to profile the transcriptomes of individual cells using single-cell RNA sequencing (scRNA-Seq). Compared to bulk RNA-Seq data, single-cell data often contain a higher percentage of zero reads, mainly due to lower sequencing depth per cell, which mostly affects measurements of low-expression genes. However, discrepancies between platforms are observed regardless of expression level. Using four paired datasets with multiple samples each, we investigated technical and biological factors that can contribute to this expression shift. Using two separate machine learning models, we found that, in addition to expression level, RNA integrity, gene or 3′ UTR length, and the number of transcripts potentially also influence the occurrence of zeros. These findings could enable the development of novel analytical methods for cross-platform expression shift correction. We also identified genes and biological pathways in our diverse datasets that consistently showed differences when assessed at the single-cell versus bulk level, to assist in interpreting analyses across transcriptomic platforms. At the gene level, 25 genes (0.12%) were discordant in all datasets, whereas at the pathway level, 7 pathways (2.02%) showed shared enrichment in discordant genes.