
Revealing lineage-related signals in single-cell gene expression using random matrix theory

Gene expression profiles of a cellular population, generated by single-cell RNA sequencing, contain rich information about biological state, including cell type, cell cycle phase, gene regulatory patterns, and location within the tissue of origin. A major challenge is to disentangle information abo...


Bibliographic Details
Main Authors: Nitzan, Mor; Brenner, Michael P.
Format: Online Article Text
Language: English
Published: National Academy of Sciences 2021
Subjects: Physical Sciences
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7980374/
https://www.ncbi.nlm.nih.gov/pubmed/33836557
http://dx.doi.org/10.1073/pnas.1913931118
_version_ 1783667432947187712
author Nitzan, Mor
Brenner, Michael P.
author_facet Nitzan, Mor
Brenner, Michael P.
author_sort Nitzan, Mor
collection PubMed
description Gene expression profiles of a cellular population, generated by single-cell RNA sequencing, contain rich information about biological state, including cell type, cell cycle phase, gene regulatory patterns, and location within the tissue of origin. A major challenge is to disentangle information about these different biological states from each other, including distinguishing them from cell lineage, since the correlation of cellular expression patterns is necessarily contaminated by ancestry. Here, we use a recent advance in random matrix theory, discovered in the context of protein phylogeny, to identify differentiation or ancestry-related processes in single-cell data. Qin and Colwell [C. Qin, L. J. Colwell, Proc. Natl. Acad. Sci. U.S.A. 115, 690–695 (2018)] showed that ancestral relationships in protein sequences create a power-law signature in the covariance eigenvalue distribution. We demonstrate the existence of such signatures in scRNA-seq data and that the genes driving them are indeed related to differentiation and developmental pathways. We predict the existence of similar power-law signatures for cells along linear trajectories and demonstrate this for linearly differentiating systems. Furthermore, we generalize to show that the same signatures can arise for cells along tissue-specific spatial trajectories. We illustrate these principles in diverse tissues and organisms, including the mammalian epidermis and lung, Drosophila whole-embryo, adult Hydra, dendritic cells, the intestinal epithelium, and cells undergoing induced pluripotent stem cell (iPSC) reprogramming. We show how these results can be used to interpret the gradual dynamics of lineage structure along iPSC reprogramming. Together, we provide a framework that can be used to identify signatures of specific biological processes in single-cell data without prior knowledge and identify candidate genes associated with these processes.
format Online
Article
Text
id pubmed-7980374
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher National Academy of Sciences
record_format MEDLINE/PubMed
spelling pubmed-7980374 2021-03-26 Revealing lineage-related signals in single-cell gene expression using random matrix theory Nitzan, Mor Brenner, Michael P. Proc Natl Acad Sci U S A Physical Sciences National Academy of Sciences 2021-03-16 2021-03-08 /pmc/articles/PMC7980374/ /pubmed/33836557 http://dx.doi.org/10.1073/pnas.1913931118 Text en Copyright © 2021 the Author(s). Published by PNAS. https://creativecommons.org/licenses/by-nc-nd/4.0/ This open access article is distributed under Creative Commons Attribution-NonCommercial-NoDerivatives License 4.0 (CC BY-NC-ND).
spellingShingle Physical Sciences
Nitzan, Mor
Brenner, Michael P.
Revealing lineage-related signals in single-cell gene expression using random matrix theory
title Revealing lineage-related signals in single-cell gene expression using random matrix theory
topic Physical Sciences
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7980374/
https://www.ncbi.nlm.nih.gov/pubmed/33836557
http://dx.doi.org/10.1073/pnas.1913931118