SC-JNMF: single-cell clustering integrating multiple quantification methods based on joint non-negative matrix factorization
Single-cell RNA-sequencing is a rapidly evolving technology that enables us to understand biological processes at unprecedented resolution. Single-cell expression analysis requires a complex data processing pipeline, and the pipeline is divided into two main parts: The quantification part, which con...
Main Authors: | Shiga, Mikio; Seno, Shigeto; Onizuka, Makoto; Matsuda, Hideo |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | PeerJ Inc. 2021 |
Subjects: | Bioinformatics |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8404576/ https://www.ncbi.nlm.nih.gov/pubmed/34532161 http://dx.doi.org/10.7717/peerj.12087 |
_version_ | 1783746193681022976 |
---|---|
author | Shiga, Mikio; Seno, Shigeto; Onizuka, Makoto; Matsuda, Hideo |
author_facet | Shiga, Mikio; Seno, Shigeto; Onizuka, Makoto; Matsuda, Hideo |
author_sort | Shiga, Mikio |
collection | PubMed |
description | Single-cell RNA-sequencing is a rapidly evolving technology that enables us to understand biological processes at unprecedented resolution. Single-cell expression analysis requires a complex data processing pipeline, and the pipeline is divided into two main parts: The quantification part, which converts the sequence information into gene-cell matrix data; the analysis part, which analyzes the matrix data using statistics and/or machine learning techniques. In the analysis part, unsupervised cell clustering plays an important role in identifying cell types and discovering cell diversity and subpopulations. Identified cell clusters are also used for subsequent analysis, such as finding differentially expressed genes and inferring cell trajectories. However, single-cell clustering using gene expression profiles shows different results depending on the quantification methods. Clustering results are greatly affected by the quantification method used in the upstream process. In other words, even if the original RNA-sequence data is the same, gene expression profiles processed by different quantification methods will produce different clusters. In this article, we propose a robust and highly accurate clustering method based on joint non-negative matrix factorization (joint-NMF) by utilizing the information from multiple gene expression profiles quantified using different methods from the same RNA-sequence data. Our joint-NMF can extract common factors among multiple gene expression profiles by applying each NMF under the constraint that one of the factorized matrices is shared among multiple NMFs. The joint-NMF determines more robust and accurate cell clustering results by leveraging multiple quantification methods compared to conventional clustering methods, which use only a single gene expression profile. Additionally, we showed the usefulness of discovering marker genes with the extracted features using our method. |
format | Online Article Text |
id | pubmed-8404576 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | PeerJ Inc. |
record_format | MEDLINE/PubMed |
spelling | pubmed-8404576 2021-09-15 SC-JNMF: single-cell clustering integrating multiple quantification methods based on joint non-negative matrix factorization Shiga, Mikio Seno, Shigeto Onizuka, Makoto Matsuda, Hideo PeerJ Bioinformatics PeerJ Inc. 2021-08-27 /pmc/articles/PMC8404576/ /pubmed/34532161 http://dx.doi.org/10.7717/peerj.12087 Text en ©2021 Shiga et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ) and either DOI or URL of the article must be cited. |
spellingShingle | Bioinformatics Shiga, Mikio Seno, Shigeto Onizuka, Makoto Matsuda, Hideo SC-JNMF: single-cell clustering integrating multiple quantification methods based on joint non-negative matrix factorization |
title | SC-JNMF: single-cell clustering integrating multiple quantification methods based on joint non-negative matrix factorization |
topic | Bioinformatics |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8404576/ https://www.ncbi.nlm.nih.gov/pubmed/34532161 http://dx.doi.org/10.7717/peerj.12087 |
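
The abstract in this record describes joint-NMF as running one NMF per gene expression profile under the constraint that one factor matrix is shared. The following is a minimal sketch of that idea, not the authors' SC-JNMF implementation: it assumes plain squared-error loss with standard multiplicative updates, no regularization or weighting, and toy data. The names `joint_nmf`, `reconstruction_error`, `X_a`, and `X_b` are hypothetical, and assigning each cell to its largest shared factor is only one simple way to read clusters from the shared matrix.

```python
import numpy as np


def joint_nmf(matrices, rank, n_iter=200, eps=1e-10, seed=0):
    """Factorize each X_i (genes x cells) as W_i @ H with one shared H (rank x cells)."""
    rng = np.random.default_rng(seed)
    n_cells = matrices[0].shape[1]
    if any(X.shape[1] != n_cells for X in matrices):
        raise ValueError("all matrices must describe the same cells")
    # One basis matrix per quantification method, one shared coefficient matrix.
    Ws = [rng.random((X.shape[0], rank)) + eps for X in matrices]
    H = rng.random((rank, n_cells)) + eps
    for _ in range(n_iter):
        # Update each method-specific W_i while H is held fixed.
        for i, X in enumerate(matrices):
            Ws[i] *= (X @ H.T) / (Ws[i] @ (H @ H.T) + eps)
        # Update the shared H from the summed contributions of all profiles.
        numer = sum(W.T @ X for W, X in zip(Ws, matrices))
        denom = sum(W.T @ W for W in Ws) @ H + eps
        H *= numer / denom
    return Ws, H


def reconstruction_error(matrices, Ws, H):
    """Sum of squared Frobenius errors over all profiles."""
    return sum(np.linalg.norm(X - W @ H) ** 2 for X, W in zip(matrices, Ws))


# Toy data (assumption): 20 cells in two groups, "quantified" twice with
# different gene loadings and a little non-negative noise.
rng = np.random.default_rng(1)
H_true = np.zeros((2, 20))
H_true[0, :10] = 1.0
H_true[1, 10:] = 1.0
X_a = rng.random((30, 2)) @ H_true + 0.01 * rng.random((30, 20))
X_b = rng.random((30, 2)) @ H_true + 0.01 * rng.random((30, 20))

Ws, H = joint_nmf([X_a, X_b], rank=2)
labels = H.argmax(axis=0)  # cluster label per cell from the shared factors
print(labels)
print(reconstruction_error([X_a, X_b], Ws, H))
```

Because `H` is updated from all profiles at once, it can only capture structure that helps reconstruct every input matrix, which is the sense in which the shared factor holds the "common" signal across quantification methods.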