scPNMF: sparse gene encoding of single cells to facilitate gene selection for targeted gene profiling
MOTIVATION: Single-cell RNA sequencing (scRNA-seq) captures whole transcriptome information of individual cells. While scRNA-seq measures thousands of genes, researchers are often interested in only dozens to hundreds of genes for a closer study. Then, a question is how to select those informative g...
Main authors: | Song, Dongyuan; Li, Kexin; Hemminger, Zachary; Wollman, Roy; Li, Jingyi Jessica |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2021 |
Subjects: | Regulatory and Functional Genomics |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8275345/ https://www.ncbi.nlm.nih.gov/pubmed/34252925 http://dx.doi.org/10.1093/bioinformatics/btab273 |
_version_ | 1783721694756601856 |
---|---|
author | Song, Dongyuan Li, Kexin Hemminger, Zachary Wollman, Roy Li, Jingyi Jessica |
author_facet | Song, Dongyuan Li, Kexin Hemminger, Zachary Wollman, Roy Li, Jingyi Jessica |
author_sort | Song, Dongyuan |
collection | PubMed |
description | MOTIVATION: Single-cell RNA sequencing (scRNA-seq) captures whole transcriptome information of individual cells. While scRNA-seq measures thousands of genes, researchers are often interested in only dozens to hundreds of genes for a closer study. Then, a question is how to select those informative genes from scRNA-seq data. Moreover, single-cell targeted gene profiling technologies are gaining popularity for their low costs, high sensitivity and extra (e.g. spatial) information; however, they typically can only measure up to a few hundred genes. Then another challenging question is how to select genes for targeted gene profiling based on existing scRNA-seq data. RESULTS: Here, we develop the single-cell Projective Non-negative Matrix Factorization (scPNMF) method to select informative genes from scRNA-seq data in an unsupervised way. Compared with existing gene selection methods, scPNMF has two advantages. First, its selected informative genes can better distinguish cell types. Second, it enables the alignment of new targeted gene profiling data with reference data in a low-dimensional space to facilitate the prediction of cell types in the new data. Technically, scPNMF modifies the PNMF algorithm for gene selection by changing the initialization and adding a basis selection step, which selects informative bases to distinguish cell types. We demonstrate that scPNMF outperforms the state-of-the-art gene selection methods on diverse scRNA-seq datasets. Moreover, we show that scPNMF can guide the design of targeted gene profiling experiments and the cell-type annotation on targeted gene profiling data. AVAILABILITY AND IMPLEMENTATION: The R package is open-access and available at https://github.com/JSB-UCLA/scPNMF. The data used in this work are available at Zenodo: https://doi.org/10.5281/zenodo.4797997. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online. |
format | Online Article Text |
id | pubmed-8275345 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-8275345 2021-07-13 Bioinformatics Regulatory and Functional Genomics Oxford University Press 2021-07-12 /pmc/articles/PMC8275345/ /pubmed/34252925 http://dx.doi.org/10.1093/bioinformatics/btab273 Text en © The Author(s) 2021. Published by Oxford University Press. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
title | scPNMF: sparse gene encoding of single cells to facilitate gene selection for targeted gene profiling |
topic | Regulatory and Functional Genomics |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8275345/ https://www.ncbi.nlm.nih.gov/pubmed/34252925 http://dx.doi.org/10.1093/bioinformatics/btab273 |