Multi-View Spectral Clustering Based on Multi-Smooth Representation Fusion for Cancer Subtype Prediction

Designing an integrated machine learning model that discovers cancer subtypes and captures the heterogeneity of cancer from multiple omics data is a vital task. In recent years, several multi-view clustering algorithms have been proposed and applied to cancer subtype prediction. Among the...

Bibliographic Details
Main Authors: Liu, Jian; Ge, Shuguang; Cheng, Yuhu; Wang, Xuesong
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2021
Subjects: Genetics
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8450448/
https://www.ncbi.nlm.nih.gov/pubmed/34552619
http://dx.doi.org/10.3389/fgene.2021.718915
author Liu, Jian
Ge, Shuguang
Cheng, Yuhu
Wang, Xuesong
collection PubMed
description Designing an integrated machine learning model that discovers cancer subtypes and captures the heterogeneity of cancer from multiple omics data is a vital task. In recent years, several multi-view clustering algorithms have been proposed and applied to cancer subtype prediction. Among them, multi-view clustering methods based on graph learning have attracted wide attention. These approaches usually suffer from one or more of the following problems. Many multi-view algorithms construct the similarity matrix directly from the original omics data matrix and ignore the learning of the similarity matrix. They separate the data clustering process from the graph learning process, so clustering performance depends heavily on the predefined graph. During graph fusion, they simply average the affinity graphs of the individual views to obtain the fused graph, so the rich heterogeneous information is not fully used. To solve these problems, this paper proposes Multi-view Spectral Clustering Based on Multi-smooth Representation Fusion (MRF-MSC). First, MRF-MSC constructs a smooth representation for each data type, which can be viewed as a sample (patient) similarity matrix. The smooth representation can explicitly enhance the grouping effect. Second, MRF-MSC integrates the smooth representations of the multiple omics data through graph fusion to form a similarity matrix containing all the biological data information. In addition, MRF-MSC adaptively assigns a weight factor to the smooth regularization representation of each omics data type by using a self-weighting method. Finally, MRF-MSC imposes a constrained Laplacian rank on the fused similarity matrix to obtain a better cluster structure. The resulting problem can be transformed into a spectral clustering problem and solved to obtain the clustering results. MRF-MSC unifies graph construction, graph fusion, and spectral clustering in one framework, which allows it to learn better data representations and high-quality graphs and thereby achieve better clustering. In experiments, MRF-MSC obtained good results on the TCGA cancer data sets.
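The record carries only the abstract, so the following is a minimal Python sketch of the pipeline it outlines, not the authors' implementation. It rests on several loudly stated assumptions: the smooth representation is taken to solve G Z + alpha Z L = G (G the sample Gram matrix, L a k-nearest-neighbour graph Laplacian), the self-weighting rule is taken to be w_v = 1 / (2 ||S - W_v||_F), and the constrained Laplacian rank step is replaced by plain normalized spectral clustering with a small k-means. All function names, parameters, and the synthetic two-view data are hypothetical.

```python
import numpy as np

def knn_laplacian(X, k=5):
    """Unnormalized Laplacian of a symmetrized k-NN graph (rows of X are samples)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    np.fill_diagonal(d2, np.inf)                 # exclude self-neighbours
    idx = np.argsort(d2, axis=1)[:, :k]
    n = X.shape[0]
    W = np.zeros((n, n))
    W[np.repeat(np.arange(n), k), idx.ravel()] = 1.0
    W = np.maximum(W, W.T)
    return np.diag(W.sum(axis=1)) - W

def smooth_representation(X, alpha=1.0, k=5, eps=1e-8):
    """Assumed form: solve G Z + alpha Z L = G via two eigendecompositions."""
    G = X @ X.T
    L = knn_laplacian(X, k)
    lam, U = np.linalg.eigh(G)
    mu, V = np.linalg.eigh(L)
    lam, mu = np.clip(lam, 0.0, None), np.clip(mu, 0.0, None)
    Y = lam[:, None] * (U.T @ V) / (lam[:, None] + alpha * mu[None, :] + eps)
    Z = U @ Y @ V.T
    return (np.abs(Z) + np.abs(Z.T)) / 2.0       # symmetric, non-negative affinity

def self_weighted_fusion(graphs, n_iter=20, eps=1e-12):
    """Alternate between the weighted-mean graph S and the assumed weight rule."""
    w = np.full(len(graphs), 1.0 / len(graphs))
    for _ in range(n_iter):
        S = sum(wi * Wi for wi, Wi in zip(w, graphs)) / w.sum()
        w = np.array([1.0 / (2.0 * np.linalg.norm(S - Wi) + eps) for Wi in graphs])
    return S, w / w.sum()

def spectral_clusters(S, n_clusters, n_iter=50):
    """Normalized spectral embedding followed by a small deterministic k-means."""
    d_inv_sqrt = 1.0 / np.sqrt(S.sum(axis=1) + 1e-12)
    L = np.eye(len(S)) - d_inv_sqrt[:, None] * S * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(L)
    F = vecs[:, :n_clusters]
    F = F / (np.linalg.norm(F, axis=1, keepdims=True) + 1e-12)
    centers = [F[0]]                             # farthest-point initialization
    for _ in range(1, n_clusters):
        dist = np.min([np.sum((F - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(F[np.argmax(dist)])
    centers = np.array(centers)
    for _ in range(n_iter):
        diff = F[:, None, :] - centers[None, :, :]
        labels = np.argmin((diff ** 2).sum(axis=2), axis=1)
        for j in range(n_clusters):
            if np.any(labels == j):
                centers[j] = F[labels == j].mean(axis=0)
    return labels

# Synthetic stand-in for two omics views of 20 patients in two groups.
rng = np.random.default_rng(0)

def make_view(n_features):
    X = 0.05 * rng.standard_normal((20, n_features))
    half = n_features // 2
    X[:10, :half] += 2.0                         # group 1 signal
    X[10:, half:] += 2.0                         # group 2 signal
    return X

views = [make_view(12), make_view(16)]
graphs = [smooth_representation(X) for X in views]
S, weights = self_weighted_fusion(graphs)
labels = spectral_clusters(S, n_clusters=2)
print("view weights:", weights)
print("labels:", labels)
```

The sketch runs the three stages one after another. The abstract instead describes a single framework in which graph construction, fusion, and clustering are learned together, so this only illustrates the individual ingredients.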
format Online
Article
Text
id pubmed-8450448
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-8450448 2021-09-21 Multi-View Spectral Clustering Based on Multi-Smooth Representation Fusion for Cancer Subtype Prediction Liu, Jian; Ge, Shuguang; Cheng, Yuhu; Wang, Xuesong Front Genet Genetics Frontiers Media S.A. 2021-09-06 /pmc/articles/PMC8450448/ /pubmed/34552619 http://dx.doi.org/10.3389/fgene.2021.718915 Text en Copyright © 2021 Liu, Ge, Cheng and Wang. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title Multi-View Spectral Clustering Based on Multi-Smooth Representation Fusion for Cancer Subtype Prediction
topic Genetics
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8450448/
https://www.ncbi.nlm.nih.gov/pubmed/34552619
http://dx.doi.org/10.3389/fgene.2021.718915