Cargando…

Pumping the brakes on RNA velocity by understanding and interpreting RNA velocity estimates

BACKGROUND: RNA velocity analysis of single cells offers the potential to predict temporal dynamics from gene expression. In many systems, RNA velocity has been observed to produce a vector field that qualitatively reflects known features of the system. However, the limitations of RNA velocity estim...

Descripción completa

Detalles Bibliográficos
Autores principales: Zheng, Shijie C., Stein-O’Brien, Genevieve, Boukas, Leandros, Goff, Loyal A., Hansen, Kasper D.
Formato: Online Artículo Texto
Lenguaje:English
Publicado: BioMed Central 2023
Materias:
Acceso en línea:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10601342/
https://www.ncbi.nlm.nih.gov/pubmed/37885016
http://dx.doi.org/10.1186/s13059-023-03065-x
Descripción
Sumario:BACKGROUND: RNA velocity analysis of single cells offers the potential to predict temporal dynamics from gene expression. In many systems, RNA velocity has been observed to produce a vector field that qualitatively reflects known features of the system. However, the limitations of RNA velocity estimates are still not well understood. RESULTS: We analyze the impact of different steps in the RNA velocity workflow on direction and speed. We consider both high-dimensional velocity estimates and low-dimensional velocity vector fields mapped onto an embedding. We conclude the transition probability method for mapping velocity estimates onto an embedding is effectively interpolating in the embedding space. Our findings reveal a significant dependence of the RNA velocity workflow on smoothing via the k-nearest-neighbors (k-NN) graph of the observed data. This reliance results in considerable estimation errors for both direction and speed in both high- and low-dimensional settings when the k-NN graph fails to accurately represent the true data structure; this is an unknown feature of real data. RNA velocity performs poorly at estimating speed in both low- and high-dimensional spaces, except in very low noise settings. We introduce a novel quality measure that can identify when RNA velocity should not be used. CONCLUSIONS: Our findings emphasize the importance of choices in the RNA velocity workflow and highlight critical limitations of data analysis. We advise against over-interpreting expression dynamics using RNA velocity, particularly in terms of speed. Finally, we emphasize that the use of RNA velocity in assessing the correctness of a low-dimensional embedding is circular. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s13059-023-03065-x.