Multi-cancer samples clustering via graph regularized low-rank representation method under sparse and symmetric constraints
BACKGROUND: Identifying different types of cancer based on gene expression data has become a hotspot in bioinformatics research. Clustering cancer gene expression data from multiple cancers into their own classes is a significant solution. However, the characteristics of high-dimensional and small sample...
Main Authors: | Wang, Juan; Lu, Cong-Hai; Liu, Jin-Xing; Dai, Ling-Yun; Kong, Xiang-Zhen |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2019 |
Subjects: | Research |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6936083/ https://www.ncbi.nlm.nih.gov/pubmed/31888442 http://dx.doi.org/10.1186/s12859-019-3231-5 |
_version_ | 1783483680405061632 |
---|---|
author | Wang, Juan Lu, Cong-Hai Liu, Jin-Xing Dai, Ling-Yun Kong, Xiang-Zhen |
author_facet | Wang, Juan Lu, Cong-Hai Liu, Jin-Xing Dai, Ling-Yun Kong, Xiang-Zhen |
author_sort | Wang, Juan |
collection | PubMed |
description | BACKGROUND: Identifying different types of cancer based on gene expression data has become a hotspot in bioinformatics research. Clustering cancer gene expression data from multiple cancers into their own classes is a significant solution. However, the high dimensionality, small sample size, and noise of gene expression data make data mining and research difficult. Although there are many effective and feasible methods to deal with this problem, the possibility remains that these methods are flawed. RESULTS: In this paper, we propose the graph regularized low-rank representation under symmetric and sparse constraints (sgLRR) method, in which we introduce graph regularization based on manifold learning and symmetric sparse constraints into the traditional low-rank representation (LRR). In the sgLRR method, the symmetric and sparse constraints alleviate the effect of raw data noise on the low-rank representation. Further, the sgLRR method preserves the important intrinsic local geometrical structures of the raw data by introducing graph regularization. We apply this method to cluster multi-cancer samples based on gene expression data, which improves the clustering quality. First, the gene expression data are decomposed by the sgLRR method, which yields a lowest-rank representation matrix that is symmetric and sparse. Then, an affinity matrix is constructed to perform the multi-cancer sample clustering using a spectral clustering algorithm, i.e., normalized cuts (Ncuts), which completes the clustering of the multi-cancer samples. CONCLUSIONS: A series of comparative experiments demonstrates that the sgLRR method, based on low-rank representation, has a great advantage and remarkable performance in the clustering of multi-cancer samples. |
format | Online Article Text |
id | pubmed-6936083 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2019 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-6936083 2019-12-31 Multi-cancer samples clustering via graph regularized low-rank representation method under sparse and symmetric constraints Wang, Juan Lu, Cong-Hai Liu, Jin-Xing Dai, Ling-Yun Kong, Xiang-Zhen BMC Bioinformatics Research BioMed Central 2019-12-30 /pmc/articles/PMC6936083/ /pubmed/31888442 http://dx.doi.org/10.1186/s12859-019-3231-5 Text en © The Author(s). 2019 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Research Wang, Juan Lu, Cong-Hai Liu, Jin-Xing Dai, Ling-Yun Kong, Xiang-Zhen Multi-cancer samples clustering via graph regularized low-rank representation method under sparse and symmetric constraints |
title | Multi-cancer samples clustering via graph regularized low-rank representation method under sparse and symmetric constraints |
title_sort | multi-cancer samples clustering via graph regularized low-rank representation method under sparse and symmetric constraints |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6936083/ https://www.ncbi.nlm.nih.gov/pubmed/31888442 http://dx.doi.org/10.1186/s12859-019-3231-5 |
work_keys_str_mv | AT wangjuan multicancersamplesclusteringviagraphregularizedlowrankrepresentationmethodundersparseandsymmetricconstraints AT luconghai multicancersamplesclusteringviagraphregularizedlowrankrepresentationmethodundersparseandsymmetricconstraints AT liujinxing multicancersamplesclusteringviagraphregularizedlowrankrepresentationmethodundersparseandsymmetricconstraints AT dailingyun multicancersamplesclusteringviagraphregularizedlowrankrepresentationmethodundersparseandsymmetricconstraints AT kongxiangzhen multicancersamplesclusteringviagraphregularizedlowrankrepresentationmethodundersparseandsymmetricconstraints |
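
The pipeline outlined in the abstract (low-rank representation, then an affinity matrix, then spectral clustering) can be illustrated with a minimal NumPy sketch. This is not the sgLRR method: it assumes noise-free data and substitutes the closed-form solution of plain LRR (Z = V_r V_r^T from the SVD of the data), omitting the graph regularization and the sparse constraint. The clustering step uses a symmetric normalized Laplacian followed by k-means, which is related to but not identical to normalized cuts (Ncuts). All function names, the `rank` parameter, and the synthetic data are assumptions made for illustration.

```python
import numpy as np


def lrr_noiseless(X, rank):
    # Columns of X are samples. For noise-free data, the minimizer of
    # ||Z||_* subject to X = XZ is Z = V_r V_r^T, where V_r holds the top
    # `rank` right singular vectors of X. Z is symmetric by construction.
    _, _, vt = np.linalg.svd(X, full_matrices=False)
    v = vt[:rank].T
    return v @ v.T


def affinity_from(Z):
    # Symmetrized, non-negative affinity between samples.
    a = np.abs(Z)
    return (a + a.T) / 2.0


def spectral_embedding(W, k):
    # Eigenvectors of the symmetric normalized Laplacian for the k smallest
    # eigenvalues, with each row rescaled to unit length.
    d = np.maximum(W.sum(axis=1), 1e-12)
    s = 1.0 / np.sqrt(d)
    lap = np.eye(W.shape[0]) - s[:, None] * W * s[None, :]
    _, vecs = np.linalg.eigh(lap)  # eigenvalues come back in ascending order
    u = vecs[:, :k]
    norms = np.maximum(np.linalg.norm(u, axis=1, keepdims=True), 1e-12)
    return u / norms


def kmeans(points, k, n_iter=50):
    # Lloyd's algorithm with deterministic farthest-point initialization.
    centers = [points[0]]
    for _ in range(k - 1):
        dist = np.min([np.sum((points - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(points[np.argmax(dist)])
    centers = np.array(centers)
    for _ in range(n_iter):
        d2 = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = points[labels == j].mean(axis=0)
    return labels


def cluster_samples(X, k, rank):
    # Representation -> affinity -> spectral embedding -> k-means labels.
    Z = lrr_noiseless(X, rank)
    W = affinity_from(Z)
    return kmeans(spectral_embedding(W, k), k)


def toy_data(seed=0, n_groups=3, dim=2, per_group=10, n_features=30):
    # Synthetic stand-in for expression data (features x samples): each
    # group of samples lies in its own low-dimensional subspace.
    rng = np.random.default_rng(seed)
    blocks = [
        rng.standard_normal((n_features, dim)) @ rng.standard_normal((dim, per_group))
        for _ in range(n_groups)
    ]
    return np.hstack(blocks)


if __name__ == "__main__":
    X = toy_data()
    print(cluster_samples(X, k=3, rank=6))
```

On the synthetic data, the three groups occupy independent subspaces, so the representation matrix is block diagonal up to rounding and the groups separate cleanly. Real expression data are noisy, which is the situation the constraints described in the abstract are meant to address.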