Detecting heterogeneity in single-cell RNA-Seq data by non-negative matrix factorization
Single-cell RNA-Sequencing (scRNA-Seq) is a fast-evolving technology that enables the understanding of biological processes at an unprecedentedly high resolution. However, well-suited bioinformatics tools to analyze the data generated from this new technology are still lacking. Here we investigate t...
Main authors: | Zhu, Xun; Ching, Travers; Pan, Xinghua; Weissman, Sherman M.; Garmire, Lana |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | PeerJ Inc. 2017 |
Subjects: | Bioinformatics |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5251935/ https://www.ncbi.nlm.nih.gov/pubmed/28133571 http://dx.doi.org/10.7717/peerj.2888 |
_version_ | 1782497906285084672 |
---|---|
author | Zhu, Xun; Ching, Travers; Pan, Xinghua; Weissman, Sherman M.; Garmire, Lana |
author_sort | Zhu, Xun |
collection | PubMed |
description | Single-cell RNA-Sequencing (scRNA-Seq) is a fast-evolving technology that enables the understanding of biological processes at an unprecedentedly high resolution. However, well-suited bioinformatics tools to analyze the data generated from this new technology are still lacking. Here we investigate the performance of the non-negative matrix factorization (NMF) method in analyzing a wide variety of scRNA-Seq datasets, ranging from mouse hematopoietic stem cells to human glioblastoma data. In comparison to other unsupervised clustering methods, including K-means and hierarchical clustering, NMF has higher accuracy in separating similar groups in various datasets. We ranked genes by their importance scores (D-scores) in separating these groups, and discovered that NMF uniquely identifies genes expressed at intermediate levels as top-ranked genes. Finally, we show that in conjunction with the modularity detection method FEM, NMF reveals meaningful protein-protein interaction modules. In summary, we propose that NMF is a desirable method to analyze heterogeneous single-cell RNA-Seq data. The NMF-based subpopulation detection package is available at: https://github.com/lanagarmire/NMFEM. |
format | Online Article Text |
id | pubmed-5251935 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2017 |
publisher | PeerJ Inc. |
record_format | MEDLINE/PubMed |
spelling | pubmed-5251935 2017-01-27 Detecting heterogeneity in single-cell RNA-Seq data by non-negative matrix factorization Zhu, Xun; Ching, Travers; Pan, Xinghua; Weissman, Sherman M.; Garmire, Lana PeerJ Bioinformatics Single-cell RNA-Sequencing (scRNA-Seq) is a fast-evolving technology that enables the understanding of biological processes at an unprecedentedly high resolution. However, well-suited bioinformatics tools to analyze the data generated from this new technology are still lacking. Here we investigate the performance of the non-negative matrix factorization (NMF) method in analyzing a wide variety of scRNA-Seq datasets, ranging from mouse hematopoietic stem cells to human glioblastoma data. In comparison to other unsupervised clustering methods, including K-means and hierarchical clustering, NMF has higher accuracy in separating similar groups in various datasets. We ranked genes by their importance scores (D-scores) in separating these groups, and discovered that NMF uniquely identifies genes expressed at intermediate levels as top-ranked genes. Finally, we show that in conjunction with the modularity detection method FEM, NMF reveals meaningful protein-protein interaction modules. In summary, we propose that NMF is a desirable method to analyze heterogeneous single-cell RNA-Seq data. The NMF-based subpopulation detection package is available at: https://github.com/lanagarmire/NMFEM. PeerJ Inc. 2017-01-19 /pmc/articles/PMC5251935/ /pubmed/28133571 http://dx.doi.org/10.7717/peerj.2888 Text en ©2017 Zhu et al. http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ) and either DOI or URL of the article must be cited. |
title | Detecting heterogeneity in single-cell RNA-Seq data by non-negative matrix factorization |
topic | Bioinformatics |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5251935/ https://www.ncbi.nlm.nih.gov/pubmed/28133571 http://dx.doi.org/10.7717/peerj.2888 |