
Multi-cancer samples clustering via graph regularized low-rank representation method under sparse and symmetric constraints

BACKGROUND: Identifying different types of cancer based on gene expression data has become a hotspot in bioinformatics research. Clustering cancer gene expression data from multiple cancers into their own classes is a significant solution. However, the characteristics of high-dimensional and small sample...


Bibliographic Details
Main authors: Wang, Juan, Lu, Cong-Hai, Liu, Jin-Xing, Dai, Ling-Yun, Kong, Xiang-Zhen
Format: Online Article Text
Language: English
Published: BioMed Central 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6936083/
https://www.ncbi.nlm.nih.gov/pubmed/31888442
http://dx.doi.org/10.1186/s12859-019-3231-5
_version_ 1783483680405061632
author Wang, Juan
Lu, Cong-Hai
Liu, Jin-Xing
Dai, Ling-Yun
Kong, Xiang-Zhen
author_facet Wang, Juan
Lu, Cong-Hai
Liu, Jin-Xing
Dai, Ling-Yun
Kong, Xiang-Zhen
author_sort Wang, Juan
collection PubMed
description BACKGROUND: Identifying different types of cancer based on gene expression data has become a hotspot in bioinformatics research. Clustering cancer gene expression data from multiple cancers into their own classes is a significant solution. However, the high dimensionality and small sample size of gene expression data, together with the noise in the data, make data mining and research difficult. Although there are many effective and feasible methods to deal with this problem, the possibility remains that these methods are flawed. RESULTS: In this paper, we propose the graph regularized low-rank representation under symmetric and sparse constraints (sgLRR) method, in which we introduce graph regularization based on manifold learning and symmetric sparse constraints into the traditional low-rank representation (LRR). In the sgLRR method, the symmetric and sparse constraints alleviate the effect of raw data noise on the low-rank representation. Further, the sgLRR method preserves the important intrinsic local geometrical structures of the raw data by introducing graph regularization. We apply this method to cluster multi-cancer samples based on gene expression data, which improves the clustering quality. First, the gene expression data are decomposed by the sgLRR method, yielding a lowest-rank representation matrix that is symmetric and sparse. Then, an affinity matrix is constructed and the multi-cancer samples are clustered with a spectral clustering algorithm, i.e., normalized cuts (Ncuts). CONCLUSIONS: A series of comparative experiments demonstrate that the sgLRR method based on low-rank representation has a great advantage and remarkable performance in the clustering of multi-cancer samples.
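The pipeline outlined in the description (self-representation, affinity matrix, spectral clustering) can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' sgLRR implementation: the representation step is replaced by a simple ridge-regularized self-representation with a closed-form solution, the affinity is assumed to be the symmetrized absolute value of the representation matrix (a common choice in LRR-style clustering that the record does not confirm), and a standard normalized-Laplacian spectral clustering stands in for Ncuts. All function names, the parameter `lam`, and the toy data are hypothetical.

```python
import numpy as np

def self_representation(X, lam=0.1):
    # STAND-IN for sgLRR (assumption): ridge-regularized self-representation
    # min_Z ||X - X Z||_F^2 + lam * ||Z||_F^2, solved in closed form.
    # X has shape (n_features, n_samples); Z has shape (n_samples, n_samples).
    G = X.T @ X
    return np.linalg.solve(G + lam * np.eye(G.shape[0]), G)

def affinity_from_representation(Z):
    # Assumed affinity: symmetrized magnitudes with a zeroed diagonal.
    A = (np.abs(Z) + np.abs(Z.T)) / 2.0
    np.fill_diagonal(A, 0.0)
    return A

def spectral_embedding(A, k):
    # Eigenvectors of the normalized Laplacian I - D^{-1/2} A D^{-1/2}
    # for the k smallest eigenvalues, with rows scaled to unit length.
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(A.sum(axis=1), 1e-12))
    L = np.eye(len(A)) - d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(L)  # eigenvalues come back in ascending order
    U = vecs[:, :k]
    return U / np.maximum(np.linalg.norm(U, axis=1, keepdims=True), 1e-12)

def kmeans(U, k, n_iter=50):
    # Small deterministic k-means with farthest-point initialization.
    centers = [U[0]]
    for _ in range(1, k):
        dist = np.min([np.sum((U - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(U[np.argmax(dist)])
    centers = np.array(centers)
    for _ in range(n_iter):
        sq = ((U[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = np.argmin(sq, axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = U[labels == j].mean(axis=0)
    return labels

# Toy data (hypothetical): two groups of 20 samples, each drawn from its own
# 3-dimensional subspace of a 50-dimensional feature space, plus small noise.
rng = np.random.default_rng(0)
X1 = rng.normal(size=(50, 3)) @ rng.normal(size=(3, 20))
X2 = rng.normal(size=(50, 3)) @ rng.normal(size=(3, 20))
X = np.hstack([X1, X2]) + 0.01 * rng.normal(size=(50, 40))

Z = self_representation(X)
A = affinity_from_representation(Z)
labels = kmeans(spectral_embedding(A, k=2), k=2)
print(labels)
```

On real gene expression data the stand-in representation step would be swapped for an actual sgLRR solver; the affinity and clustering stages would stay structurally the same.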
format Online
Article
Text
id pubmed-6936083
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-6936083 2019-12-31 Multi-cancer samples clustering via graph regularized low-rank representation method under sparse and symmetric constraints Wang, Juan Lu, Cong-Hai Liu, Jin-Xing Dai, Ling-Yun Kong, Xiang-Zhen BMC Bioinformatics Research BACKGROUND: Identifying different types of cancer based on gene expression data has become a hotspot in bioinformatics research. Clustering cancer gene expression data from multiple cancers into their own classes is a significant solution. However, the high dimensionality and small sample size of gene expression data, together with the noise in the data, make data mining and research difficult. Although there are many effective and feasible methods to deal with this problem, the possibility remains that these methods are flawed. RESULTS: In this paper, we propose the graph regularized low-rank representation under symmetric and sparse constraints (sgLRR) method, in which we introduce graph regularization based on manifold learning and symmetric sparse constraints into the traditional low-rank representation (LRR). In the sgLRR method, the symmetric and sparse constraints alleviate the effect of raw data noise on the low-rank representation. Further, the sgLRR method preserves the important intrinsic local geometrical structures of the raw data by introducing graph regularization. We apply this method to cluster multi-cancer samples based on gene expression data, which improves the clustering quality. First, the gene expression data are decomposed by the sgLRR method, yielding a lowest-rank representation matrix that is symmetric and sparse. Then, an affinity matrix is constructed and the multi-cancer samples are clustered with a spectral clustering algorithm, i.e., normalized cuts (Ncuts). CONCLUSIONS: A series of comparative experiments demonstrate that the sgLRR method based on low-rank representation has a great advantage and remarkable performance in the clustering of multi-cancer samples. BioMed Central 2019-12-30 /pmc/articles/PMC6936083/ /pubmed/31888442 http://dx.doi.org/10.1186/s12859-019-3231-5 Text en © The Author(s). 2019 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle Research
Wang, Juan
Lu, Cong-Hai
Liu, Jin-Xing
Dai, Ling-Yun
Kong, Xiang-Zhen
Multi-cancer samples clustering via graph regularized low-rank representation method under sparse and symmetric constraints
title Multi-cancer samples clustering via graph regularized low-rank representation method under sparse and symmetric constraints
title_full Multi-cancer samples clustering via graph regularized low-rank representation method under sparse and symmetric constraints
title_fullStr Multi-cancer samples clustering via graph regularized low-rank representation method under sparse and symmetric constraints
title_full_unstemmed Multi-cancer samples clustering via graph regularized low-rank representation method under sparse and symmetric constraints
title_short Multi-cancer samples clustering via graph regularized low-rank representation method under sparse and symmetric constraints
title_sort multi-cancer samples clustering via graph regularized low-rank representation method under sparse and symmetric constraints
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6936083/
https://www.ncbi.nlm.nih.gov/pubmed/31888442
http://dx.doi.org/10.1186/s12859-019-3231-5