SC-JNMF: single-cell clustering integrating multiple quantification methods based on joint non-negative matrix factorization

Single-cell RNA-sequencing is a rapidly evolving technology that enables us to understand biological processes at unprecedented resolution. Single-cell expression analysis requires a complex data processing pipeline, and the pipeline is divided into two main parts: The quantification part, which con...

Bibliographic Details
Main Authors: Shiga, Mikio; Seno, Shigeto; Onizuka, Makoto; Matsuda, Hideo
Format: Online Article Text
Language: English
Published: PeerJ Inc. 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8404576/
https://www.ncbi.nlm.nih.gov/pubmed/34532161
http://dx.doi.org/10.7717/peerj.12087
author Shiga, Mikio
Seno, Shigeto
Onizuka, Makoto
Matsuda, Hideo
collection PubMed
description Single-cell RNA-sequencing is a rapidly evolving technology that enables us to understand biological processes at unprecedented resolution. Single-cell expression analysis requires a complex data processing pipeline with two main parts: the quantification part, which converts the sequence information into gene-cell matrix data, and the analysis part, which analyzes the matrix data using statistics and/or machine learning techniques. In the analysis part, unsupervised cell clustering plays an important role in identifying cell types and discovering cell diversity and subpopulations. Identified cell clusters are also used for subsequent analyses, such as finding differentially expressed genes and inferring cell trajectories. However, clustering results are greatly affected by the quantification method used upstream: even if the original RNA-sequence data are the same, gene expression profiles processed by different quantification methods will produce different clusters. In this article, we propose a robust and highly accurate clustering method based on joint non-negative matrix factorization (joint-NMF) that utilizes multiple gene expression profiles quantified by different methods from the same RNA-sequence data. Our joint-NMF extracts common factors among the gene expression profiles by applying an NMF to each profile under the constraint that one of the factorized matrices is shared among all the NMFs. By leveraging multiple quantification methods, the joint-NMF yields more robust and accurate cell clustering results than conventional clustering methods, which use only a single gene expression profile. Additionally, we show that the features extracted by our method are useful for discovering marker genes.
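The shared-factor constraint described in the abstract can be illustrated with a minimal sketch. It assumes each quantification method yields a non-negative genes-by-cells matrix X_k over the same cells, factorizes each as X_k ≈ W_k H with a single cell-factor matrix H shared by all of them, and minimizes the summed squared reconstruction error with standard multiplicative updates. The function name joint_nmf, the toy data, the rank, and the argmax cluster assignment are assumptions for illustration only; this record does not state the authors' exact objective, regularization, or clustering step.

```python
import numpy as np


def joint_nmf(matrices, rank, n_iter=200, eps=1e-10, seed=0):
    """Factorize each X_k (genes_k x cells) as W_k @ H with one shared H.

    Hypothetical helper: minimizes sum_k ||X_k - W_k H||_F^2 using
    multiplicative updates. Not the authors' implementation.
    """
    rng = np.random.default_rng(seed)
    n_cells = matrices[0].shape[1]
    if any(X.shape[1] != n_cells for X in matrices):
        raise ValueError("all matrices must describe the same cells (columns)")

    # One gene-factor matrix per quantification method, one shared cell-factor matrix.
    Ws = [rng.random((X.shape[0], rank)) + eps for X in matrices]
    H = rng.random((rank, n_cells)) + eps

    losses = []
    for _ in range(n_iter):
        # Update each W_k with H held fixed.
        for k, X in enumerate(matrices):
            W = Ws[k]
            Ws[k] = W * (X @ H.T) / (W @ (H @ H.T) + eps)
        # Update the shared H using contributions from every matrix.
        numer = sum(W.T @ X for W, X in zip(Ws, matrices))
        denom = sum(W.T @ W for W in Ws) @ H + eps
        H = H * numer / denom
        losses.append(
            sum(np.linalg.norm(X - W @ H) ** 2 for W, X in zip(Ws, matrices))
        )
    return Ws, H, losses


# Toy data (assumption): two "quantifications" of the same 40 cells,
# each with its own gene set, generated from two hidden cell groups.
rng = np.random.default_rng(1)
true_H = np.zeros((2, 40))
true_H[0, :20] = 1.0
true_H[1, 20:] = 1.0
X1 = rng.random((30, 2)) @ true_H + 0.01 * rng.random((30, 40))
X2 = rng.random((25, 2)) @ true_H + 0.01 * rng.random((25, 40))

Ws, H, losses = joint_nmf([X1, X2], rank=2)

# Simple cluster assignment (assumption): the dominant factor of each cell.
labels = H.argmax(axis=0)
print(labels)
```

Because H is updated from the pooled numerator and denominator of all matrices, a factor survives only if it helps reconstruct every quantification, which is the sense in which the shared matrix captures common structure.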
format Online
Article
Text
id pubmed-8404576
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher PeerJ Inc.
record_format MEDLINE/PubMed
spelling pubmed-8404576 2021-09-15 PeerJ Bioinformatics PeerJ Inc. 2021-08-27 /pmc/articles/PMC8404576/ /pubmed/34532161 http://dx.doi.org/10.7717/peerj.12087 Text en ©2021 Shiga et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ) and either DOI or URL of the article must be cited.
title SC-JNMF: single-cell clustering integrating multiple quantification methods based on joint non-negative matrix factorization
topic Bioinformatics
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8404576/
https://www.ncbi.nlm.nih.gov/pubmed/34532161
http://dx.doi.org/10.7717/peerj.12087