scPNMF: sparse gene encoding of single cells to facilitate gene selection for targeted gene profiling

MOTIVATION: Single-cell RNA sequencing (scRNA-seq) captures whole transcriptome information of individual cells. While scRNA-seq measures thousands of genes, researchers are often interested in only dozens to hundreds of genes for a closer study. Then, a question is how to select those informative g...

Bibliographic Details
Main Authors: Song, Dongyuan; Li, Kexin; Hemminger, Zachary; Wollman, Roy; Li, Jingyi Jessica
Format: Online Article Text
Language: English
Published: Oxford University Press 2021
Subjects: Regulatory and Functional Genomics
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8275345/
https://www.ncbi.nlm.nih.gov/pubmed/34252925
http://dx.doi.org/10.1093/bioinformatics/btab273
author Song, Dongyuan
Li, Kexin
Hemminger, Zachary
Wollman, Roy
Li, Jingyi Jessica
author_sort Song, Dongyuan
collection PubMed
description MOTIVATION: Single-cell RNA sequencing (scRNA-seq) captures whole transcriptome information of individual cells. While scRNA-seq measures thousands of genes, researchers are often interested in only dozens to hundreds of genes for a closer study. Then, a question is how to select those informative genes from scRNA-seq data. Moreover, single-cell targeted gene profiling technologies are gaining popularity for their low costs, high sensitivity and extra (e.g. spatial) information; however, they typically can only measure up to a few hundred genes. Then another challenging question is how to select genes for targeted gene profiling based on existing scRNA-seq data. RESULTS: Here, we develop the single-cell Projective Non-negative Matrix Factorization (scPNMF) method to select informative genes from scRNA-seq data in an unsupervised way. Compared with existing gene selection methods, scPNMF has two advantages. First, its selected informative genes can better distinguish cell types. Second, it enables the alignment of new targeted gene profiling data with reference data in a low-dimensional space to facilitate the prediction of cell types in the new data. Technically, scPNMF modifies the PNMF algorithm for gene selection by changing the initialization and adding a basis selection step, which selects informative bases to distinguish cell types. We demonstrate that scPNMF outperforms the state-of-the-art gene selection methods on diverse scRNA-seq datasets. Moreover, we show that scPNMF can guide the design of targeted gene profiling experiments and the cell-type annotation on targeted gene profiling data. AVAILABILITY AND IMPLEMENTATION: The R package is open-access and available at https://github.com/JSB-UCLA/scPNMF. The data used in this work are available at Zenodo: https://doi.org/10.5281/zenodo.4797997. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
format Online
Article
Text
id pubmed-8275345
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-8275345 2021-07-13 scPNMF: sparse gene encoding of single cells to facilitate gene selection for targeted gene profiling Song, Dongyuan Li, Kexin Hemminger, Zachary Wollman, Roy Li, Jingyi Jessica Bioinformatics Regulatory and Functional Genomics Oxford University Press 2021-07-12 /pmc/articles/PMC8275345/ /pubmed/34252925 http://dx.doi.org/10.1093/bioinformatics/btab273 Text en © The Author(s) 2021. Published by Oxford University Press. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title scPNMF: sparse gene encoding of single cells to facilitate gene selection for targeted gene profiling
topic Regulatory and Functional Genomics