Multiscale Methods for Signal Selection in Single-Cell Data
Analysis of single-cell transcriptomics often relies on clustering cells and then performing differential gene expression (DGE) to identify genes that vary between these clusters. These discrete analyses successfully determine cell types and markers; however, continuous variation within and between...
Main authors: | Hoekzema, Renee S.; Marsh, Lewis; Sumray, Otto; Carroll, Thomas M.; Lu, Xin; Byrne, Helen M.; Harrington, Heather A. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | MDPI 2022 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9407339/ https://www.ncbi.nlm.nih.gov/pubmed/36010781 http://dx.doi.org/10.3390/e24081116 |
_version_ | 1784774340236017664 |
---|---|
author | Hoekzema, Renee S. Marsh, Lewis Sumray, Otto Carroll, Thomas M. Lu, Xin Byrne, Helen M. Harrington, Heather A. |
author_facet | Hoekzema, Renee S. Marsh, Lewis Sumray, Otto Carroll, Thomas M. Lu, Xin Byrne, Helen M. Harrington, Heather A. |
author_sort | Hoekzema, Renee S. |
collection | PubMed |
description | Analysis of single-cell transcriptomics often relies on clustering cells and then performing differential gene expression (DGE) to identify genes that vary between these clusters. These discrete analyses successfully determine cell types and markers; however, continuous variation within and between cell types may not be detected. We propose three topologically motivated mathematical methods for unsupervised feature selection that consider discrete and continuous transcriptional patterns on an equal footing across multiple scales simultaneously. Eigenscores ([Formula: see text]) rank signals or genes based on their correspondence to low-frequency intrinsic patterning in the data using the spectral decomposition of the graph Laplacian. The multiscale Laplacian score (MLS) is an unsupervised method for locating relevant scales in data and selecting the genes that are coherently expressed at these respective scales. The persistent Rayleigh quotient (PRQ) takes data equipped with a filtration, allowing the separation of genes with different roles in a bifurcation process (e.g., pseudo-time). We demonstrate the utility of these techniques by applying them to published single-cell transcriptomics data sets. The methods validate previously identified genes and detect additional biologically meaningful genes with coherent expression patterns. By studying the interaction between gene signals and the geometry of the underlying space, the three methods give multidimensional rankings of the genes and visualisation of relationships between them. |
format | Online Article Text |
id | pubmed-9407339 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-9407339 2022-08-26 Multiscale Methods for Signal Selection in Single-Cell Data Hoekzema, Renee S. Marsh, Lewis Sumray, Otto Carroll, Thomas M. Lu, Xin Byrne, Helen M. Harrington, Heather A. Entropy (Basel) Article MDPI 2022-08-13 /pmc/articles/PMC9407339/ /pubmed/36010781 http://dx.doi.org/10.3390/e24081116 Text en © 2022 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). |
spellingShingle | Article Hoekzema, Renee S. Marsh, Lewis Sumray, Otto Carroll, Thomas M. Lu, Xin Byrne, Helen M. Harrington, Heather A. Multiscale Methods for Signal Selection in Single-Cell Data |
title | Multiscale Methods for Signal Selection in Single-Cell Data |
title_full | Multiscale Methods for Signal Selection in Single-Cell Data |
title_fullStr | Multiscale Methods for Signal Selection in Single-Cell Data |
title_full_unstemmed | Multiscale Methods for Signal Selection in Single-Cell Data |
title_short | Multiscale Methods for Signal Selection in Single-Cell Data |
title_sort | multiscale methods for signal selection in single-cell data |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9407339/ https://www.ncbi.nlm.nih.gov/pubmed/36010781 http://dx.doi.org/10.3390/e24081116 |
work_keys_str_mv | AT hoekzemarenees multiscalemethodsforsignalselectioninsinglecelldata AT marshlewis multiscalemethodsforsignalselectioninsinglecelldata AT sumrayotto multiscalemethodsforsignalselectioninsinglecelldata AT carrollthomasm multiscalemethodsforsignalselectioninsinglecelldata AT luxin multiscalemethodsforsignalselectioninsinglecelldata AT byrnehelenm multiscalemethodsforsignalselectioninsinglecelldata AT harringtonheathera multiscalemethodsforsignalselectioninsinglecelldata |
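
The abstract describes ranking genes by how coherently they vary over the geometry of the data. As a rough illustration only, the sketch below computes the classical single-scale Laplacian score of a signal on a k-nearest-neighbour graph; lower scores indicate signals that change smoothly across neighbouring cells. This is not the authors' implementation of the MLS, the eigenscores, or the PRQ. The function names `knn_adjacency` and `laplacian_score`, the unweighted kNN graph, the use of `k` as a crude stand-in for scale, and the synthetic data are all assumptions made for the example.

```python
import numpy as np


def knn_adjacency(points, k):
    """Symmetric, unweighted k-nearest-neighbour adjacency matrix (assumed graph choice)."""
    n = points.shape[0]
    # Pairwise squared Euclidean distances between all points.
    diff = points[:, None, :] - points[None, :, :]
    dist2 = np.sum(diff ** 2, axis=-1)
    np.fill_diagonal(dist2, np.inf)  # a point is never its own neighbour
    neighbours = np.argsort(dist2, axis=1)[:, :k]
    adj = np.zeros((n, n))
    rows = np.repeat(np.arange(n), k)
    adj[rows, neighbours.ravel()] = 1.0
    # Keep an edge if either endpoint lists the other as a neighbour.
    return np.maximum(adj, adj.T)


def laplacian_score(signal, adj):
    """Classical Laplacian score: Rayleigh quotient of the degree-centred signal."""
    degree = adj.sum(axis=1)
    laplacian = np.diag(degree) - adj
    # Remove the degree-weighted mean so constant signals do not dominate.
    centred = signal - (signal @ degree) / degree.sum()
    denom = centred @ (degree * centred)
    if denom == 0:
        return float("nan")  # constant signal: score undefined
    return float(centred @ laplacian @ centred / denom)


# Synthetic stand-in for cells embedded in two dimensions (hypothetical data).
rng = np.random.default_rng(0)
cells = rng.uniform(size=(200, 2))
smooth_gene = cells[:, 0]            # varies gradually along one axis
noise_gene = rng.normal(size=200)    # no spatial coherence

adj = knn_adjacency(cells, k=10)
smooth_score = laplacian_score(smooth_gene, adj)
noise_score = laplacian_score(noise_gene, adj)

# Varying k is used here only as a crude proxy for looking at several scales.
scores_by_k = {
    k: laplacian_score(smooth_gene, knn_adjacency(cells, k)) for k in (5, 10, 20)
}
```

In this toy setting the smoothly varying signal should receive a markedly lower score than the noise signal, which is the sense in which such scores favour coherently expressed genes.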