Detecting heterogeneity in single-cell RNA-Seq data by non-negative matrix factorization

Single-cell RNA-Sequencing (scRNA-Seq) is a fast-evolving technology that enables the understanding of biological processes at an unprecedentedly high resolution. However, well-suited bioinformatics tools to analyze the data generated from this new technology are still lacking. Here we investigate t...

Bibliographic Details
Main authors: Zhu, Xun; Ching, Travers; Pan, Xinghua; Weissman, Sherman M.; Garmire, Lana
Format: Online Article Text
Language: English
Published: PeerJ Inc. 2017
Subjects: Bioinformatics
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5251935/
https://www.ncbi.nlm.nih.gov/pubmed/28133571
http://dx.doi.org/10.7717/peerj.2888
_version_ 1782497906285084672
author Zhu, Xun
Ching, Travers
Pan, Xinghua
Weissman, Sherman M.
Garmire, Lana
author_facet Zhu, Xun
Ching, Travers
Pan, Xinghua
Weissman, Sherman M.
Garmire, Lana
author_sort Zhu, Xun
collection PubMed
description Single-cell RNA-Sequencing (scRNA-Seq) is a fast-evolving technology that enables the understanding of biological processes at an unprecedentedly high resolution. However, well-suited bioinformatics tools to analyze the data generated from this new technology are still lacking. Here we investigate the performance of the non-negative matrix factorization (NMF) method to analyze a wide variety of scRNA-Seq datasets, ranging from mouse hematopoietic stem cells to human glioblastoma data. In comparison to other unsupervised clustering methods including K-means and hierarchical clustering, NMF has higher accuracy in separating similar groups in various datasets. We ranked genes by their importance scores (D-scores) in separating these groups, and discovered that NMF uniquely identifies genes expressed at intermediate levels as top-ranked genes. Finally, we show that in conjunction with the modularity detection method FEM, NMF reveals meaningful protein-protein interaction modules. In summary, we propose that NMF is a desirable method to analyze heterogeneous single-cell RNA-Seq data. The NMF-based subpopulation detection package is available at: https://github.com/lanagarmire/NMFEM.
format Online
Article
Text
id pubmed-5251935
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher PeerJ Inc.
record_format MEDLINE/PubMed
spelling pubmed-5251935 2017-01-27 Detecting heterogeneity in single-cell RNA-Seq data by non-negative matrix factorization Zhu, Xun Ching, Travers Pan, Xinghua Weissman, Sherman M. Garmire, Lana PeerJ Bioinformatics Single-cell RNA-Sequencing (scRNA-Seq) is a fast-evolving technology that enables the understanding of biological processes at an unprecedentedly high resolution. However, well-suited bioinformatics tools to analyze the data generated from this new technology are still lacking. Here we investigate the performance of the non-negative matrix factorization (NMF) method to analyze a wide variety of scRNA-Seq datasets, ranging from mouse hematopoietic stem cells to human glioblastoma data. In comparison to other unsupervised clustering methods including K-means and hierarchical clustering, NMF has higher accuracy in separating similar groups in various datasets. We ranked genes by their importance scores (D-scores) in separating these groups, and discovered that NMF uniquely identifies genes expressed at intermediate levels as top-ranked genes. Finally, we show that in conjunction with the modularity detection method FEM, NMF reveals meaningful protein-protein interaction modules. In summary, we propose that NMF is a desirable method to analyze heterogeneous single-cell RNA-Seq data. The NMF-based subpopulation detection package is available at: https://github.com/lanagarmire/NMFEM. PeerJ Inc. 2017-01-19 /pmc/articles/PMC5251935/ /pubmed/28133571 http://dx.doi.org/10.7717/peerj.2888 Text en ©2017 Zhu et al. http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed.
For attribution, the original author(s), title, publication source (PeerJ) and either DOI or URL of the article must be cited.
spellingShingle Bioinformatics
Zhu, Xun
Ching, Travers
Pan, Xinghua
Weissman, Sherman M.
Garmire, Lana
Detecting heterogeneity in single-cell RNA-Seq data by non-negative matrix factorization
title Detecting heterogeneity in single-cell RNA-Seq data by non-negative matrix factorization
title_full Detecting heterogeneity in single-cell RNA-Seq data by non-negative matrix factorization
title_fullStr Detecting heterogeneity in single-cell RNA-Seq data by non-negative matrix factorization
title_full_unstemmed Detecting heterogeneity in single-cell RNA-Seq data by non-negative matrix factorization
title_short Detecting heterogeneity in single-cell RNA-Seq data by non-negative matrix factorization
title_sort detecting heterogeneity in single-cell rna-seq data by non-negative matrix factorization
topic Bioinformatics
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5251935/
https://www.ncbi.nlm.nih.gov/pubmed/28133571
http://dx.doi.org/10.7717/peerj.2888
work_keys_str_mv AT zhuxun detectingheterogeneityinsinglecellrnaseqdatabynonnegativematrixfactorization
AT chingtravers detectingheterogeneityinsinglecellrnaseqdatabynonnegativematrixfactorization
AT panxinghua detectingheterogeneityinsinglecellrnaseqdatabynonnegativematrixfactorization
AT weissmanshermanm detectingheterogeneityinsinglecellrnaseqdatabynonnegativematrixfactorization
AT garmirelana detectingheterogeneityinsinglecellrnaseqdatabynonnegativematrixfactorization