Multi-view Subspace Clustering Analysis for Aggregating Multiple Heterogeneous Omics Data
Integration of distinct biological data types could provide a comprehensive view of biological processes or complex diseases. The combinations of molecules responsible for different phenotypes form multiple embedded (expression) subspaces, thus identifying the intrinsic data structure is challenging...
Main authors: | Shi, Qianqian; Hu, Bing; Zeng, Tao; Zhang, Chuanchao |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Frontiers Media S.A. 2019 |
Subjects: | Genetics |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6712585/ https://www.ncbi.nlm.nih.gov/pubmed/31497031 http://dx.doi.org/10.3389/fgene.2019.00744 |
_version_ | 1783446703412609024 |
---|---|
author | Shi, Qianqian Hu, Bing Zeng, Tao Zhang, Chuanchao |
author_facet | Shi, Qianqian Hu, Bing Zeng, Tao Zhang, Chuanchao |
author_sort | Shi, Qianqian |
collection | PubMed |
description | Integration of distinct biological data types could provide a comprehensive view of biological processes or complex diseases. The combinations of molecules responsible for different phenotypes form multiple embedded (expression) subspaces, thus identifying the intrinsic data structure is challenging by regular integration methods. In this paper, we propose a novel framework of “Multi-view Subspace Clustering Analysis (MSCA),” which could measure the local similarities of samples in the same subspace and obtain the global consensus sample patterns (structures) for multiple data types, thereby comprehensively capturing the underlying heterogeneity of samples. Applied to various synthetic datasets, MSCA performs effectively to recognize the predefined sample patterns, and is robust to data noises. Given a real biological dataset, i.e., Cancer Cell Line Encyclopedia (CCLE) data, MSCA successfully identifies cell clusters of common aberrations across cancer types. A remarkable superiority over the state-of-the-art methods, such as iClusterPlus, SNF, and ANF, has also been demonstrated in our simulation and case studies. |
format | Online Article Text |
id | pubmed-6712585 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2019 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-67125852019-09-06 Multi-view Subspace Clustering Analysis for Aggregating Multiple Heterogeneous Omics Data Shi, Qianqian Hu, Bing Zeng, Tao Zhang, Chuanchao Front Genet Genetics Integration of distinct biological data types could provide a comprehensive view of biological processes or complex diseases. The combinations of molecules responsible for different phenotypes form multiple embedded (expression) subspaces, thus identifying the intrinsic data structure is challenging by regular integration methods. In this paper, we propose a novel framework of “Multi-view Subspace Clustering Analysis (MSCA),” which could measure the local similarities of samples in the same subspace and obtain the global consensus sample patterns (structures) for multiple data types, thereby comprehensively capturing the underlying heterogeneity of samples. Applied to various synthetic datasets, MSCA performs effectively to recognize the predefined sample patterns, and is robust to data noises. Given a real biological dataset, i.e., Cancer Cell Line Encyclopedia (CCLE) data, MSCA successfully identifies cell clusters of common aberrations across cancer types. A remarkable superiority over the state-of-the-art methods, such as iClusterPlus, SNF, and ANF, has also been demonstrated in our simulation and case studies. Frontiers Media S.A. 2019-08-20 /pmc/articles/PMC6712585/ /pubmed/31497031 http://dx.doi.org/10.3389/fgene.2019.00744 Text en Copyright © 2019 Shi, Hu, Zeng and Zhang http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
spellingShingle | Genetics Shi, Qianqian Hu, Bing Zeng, Tao Zhang, Chuanchao Multi-view Subspace Clustering Analysis for Aggregating Multiple Heterogeneous Omics Data |
title | Multi-view Subspace Clustering Analysis for Aggregating Multiple Heterogeneous Omics Data |
title_full | Multi-view Subspace Clustering Analysis for Aggregating Multiple Heterogeneous Omics Data |
title_fullStr | Multi-view Subspace Clustering Analysis for Aggregating Multiple Heterogeneous Omics Data |
title_full_unstemmed | Multi-view Subspace Clustering Analysis for Aggregating Multiple Heterogeneous Omics Data |
title_short | Multi-view Subspace Clustering Analysis for Aggregating Multiple Heterogeneous Omics Data |
title_sort | multi-view subspace clustering analysis for aggregating multiple heterogeneous omics data |
topic | Genetics |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6712585/ https://www.ncbi.nlm.nih.gov/pubmed/31497031 http://dx.doi.org/10.3389/fgene.2019.00744 |
work_keys_str_mv | AT shiqianqian multiviewsubspaceclusteringanalysisforaggregatingmultipleheterogeneousomicsdata AT hubing multiviewsubspaceclusteringanalysisforaggregatingmultipleheterogeneousomicsdata AT zengtao multiviewsubspaceclusteringanalysisforaggregatingmultipleheterogeneousomicsdata AT zhangchuanchao multiviewsubspaceclusteringanalysisforaggregatingmultipleheterogeneousomicsdata |