
Single-cell gene set scoring with nearest neighbor graph smoothed data (gssnng)

SUMMARY: Gene set scoring (or enrichment) is a common dimension reduction task in bioinformatics that can be focused on the differences between groups or at the single sample level. Gene sets can represent biological functions, molecular pathways, cell identities, and more. Gene set scores are conte...


Bibliographic Details
Main authors: Gibbs, David L; Strasser, Michael K; Huang, Sui
Format: Online Article Text
Language: English
Published: Oxford University Press 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10599965/
https://www.ncbi.nlm.nih.gov/pubmed/37886712
http://dx.doi.org/10.1093/bioadv/vbad150
author Gibbs, David L
Strasser, Michael K
Huang, Sui
collection PubMed
description SUMMARY: Gene set scoring (or enrichment) is a common dimension reduction task in bioinformatics that can focus on differences between groups or on the single-sample level. Gene sets can represent biological functions, molecular pathways, cell identities, and more. Gene set scores are context-dependent values that are useful for interpreting biological changes following experiments or perturbations. Single-sample scoring produces a set of scores, one for each member of a group, which can be analyzed with statistical models that include additional clinically important factors such as gender or age. However, the sparsity and technical noise of single-cell expression measures create difficulties for these methods, which were originally designed for bulk expression profiling (microarrays, RNAseq). These difficulties can be greatly remedied by first applying a smoothing transformation that shares gene measure information within transcriptomic neighborhoods. In this work, we use the nearest neighbor graph of cells for matrix smoothing to produce high-quality gene set scores at the per-cell, per-group level that are useful for visualization and statistical analysis. AVAILABILITY AND IMPLEMENTATION: The gssnng software is available from the Python Package Index (PyPI) and works with Scanpy AnnData objects. It can be installed using “pip install gssnng”. More information and demo notebooks: see https://github.com/IlyaLab/gssnng.
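The smoothing idea described in the summary can be sketched as follows. This is a minimal illustration under stated assumptions and not the gssnng implementation: it assumes a small dense cells-by-genes NumPy matrix, builds the neighbor graph from plain Euclidean distances, and uses a simple mean as the gene set score. The function names `knn_smooth` and `mean_score`, the parameter `k`, and the toy data are all hypothetical.

```python
import numpy as np


def knn_smooth(X, k):
    """Replace each cell's profile with the average over itself and its k nearest cells.

    X is a dense (cells x genes) matrix. Illustrative only: neighbors are found
    by brute-force Euclidean distance, which does not scale to large datasets.
    """
    # Pairwise squared Euclidean distances between cells (rows of X).
    sq = (X ** 2).sum(axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    # Force each cell to rank first in its own neighborhood.
    np.fill_diagonal(d2, -np.inf)
    # Column indices of the cell itself plus its k nearest neighbors.
    nbrs = np.argsort(d2, axis=1)[:, : k + 1]
    # Row-normalized adjacency matrix: every row sums to 1.
    n = X.shape[0]
    A = np.zeros((n, n))
    A[np.arange(n)[:, None], nbrs] = 1.0 / (k + 1)
    # Smoothing shares gene measurements within each neighborhood.
    return A @ X


def mean_score(X_smooth, gene_idx):
    """Assumed scoring rule: per-cell mean of the smoothed values in a gene set."""
    return X_smooth[:, gene_idx].mean(axis=1)


# Toy sparse count matrix: 50 cells, 20 genes (synthetic, for illustration).
rng = np.random.default_rng(0)
X_demo = rng.poisson(0.3, size=(50, 20)).astype(float)

smoothed = knn_smooth(X_demo, k=5)
scores = mean_score(smoothed, gene_idx=[0, 3, 7])  # one score per cell
```

Because each smoothed value is an average over a neighborhood, an entry stays zero only when the cell and all of its neighbors are zero for that gene, which is how smoothing reduces the sparsity that troubles scoring methods designed for bulk data.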
format Online
Article
Text
id pubmed-10599965
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-10599965 2023-10-26 Single-cell gene set scoring with nearest neighbor graph smoothed data (gssnng) Gibbs, David L Strasser, Michael K Huang, Sui Bioinform Adv Application Note Oxford University Press 2023-10-18 /pmc/articles/PMC10599965/ /pubmed/37886712 http://dx.doi.org/10.1093/bioadv/vbad150 Text en © The Author(s) 2023. Published by Oxford University Press.
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title Single-cell gene set scoring with nearest neighbor graph smoothed data (gssnng)
topic Application Note