A network-assisted co-clustering algorithm to discover cancer subtypes based on gene expression

BACKGROUND: Cancer subtype information is critically important for understanding tumor heterogeneity. Existing methods to identify cancer subtypes have primarily focused on utilizing generic clustering algorithms (such as hierarchical clustering) to identify subtypes based on gene expression data. T...

Bibliographic Details
Main Authors: Liu, Yiyi; Gu, Quanquan; Hou, Jack P; Han, Jiawei; Ma, Jian
Format: Online Article Text
Language: English
Published: BioMed Central 2014
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3916445/
https://www.ncbi.nlm.nih.gov/pubmed/24491042
http://dx.doi.org/10.1186/1471-2105-15-37
author Liu, Yiyi
Gu, Quanquan
Hou, Jack P
Han, Jiawei
Ma, Jian
collection PubMed
description BACKGROUND: Cancer subtype information is critically important for understanding tumor heterogeneity. Existing methods to identify cancer subtypes have primarily focused on utilizing generic clustering algorithms (such as hierarchical clustering) to identify subtypes based on gene expression data. The network-level interaction among genes, which is key to understanding the molecular perturbations in cancer, has been rarely considered during the clustering process. The motivation of our work is to develop a method that effectively incorporates molecular interaction networks into the clustering process to improve cancer subtype identification. RESULTS: We have developed a new clustering algorithm for cancer subtype identification, called “network-assisted co-clustering for the identification of cancer subtypes” (NCIS). NCIS combines gene network information to simultaneously group samples and genes into biologically meaningful clusters. Prior to clustering, we assign weights to genes based on their impact in the network. Then a new weighted co-clustering algorithm based on a semi-nonnegative matrix tri-factorization is applied. We evaluated the effectiveness of NCIS on simulated datasets as well as large-scale Breast Cancer and Glioblastoma Multiforme patient samples from The Cancer Genome Atlas (TCGA) project. NCIS was shown to better separate the patient samples into clinically distinct subtypes and achieve higher accuracy on the simulated datasets to tolerate noise, as compared to consensus hierarchical clustering. CONCLUSIONS: The weighted co-clustering approach in NCIS provides a unique solution to incorporate gene network information into the clustering process. Our tool will be useful to comprehensively identify cancer subtypes that would otherwise be obscured by cancer heterogeneity, using high-throughput and high-dimensional gene expression data.
format Online
Article
Text
id pubmed-3916445
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3916445 2014-02-24 BMC Bioinformatics Methodology Article BioMed Central 2014-02-04 /pmc/articles/PMC3916445/ /pubmed/24491042 http://dx.doi.org/10.1186/1471-2105-15-37 Text en Copyright © 2014 Liu et al.; licensee BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title A network-assisted co-clustering algorithm to discover cancer subtypes based on gene expression
topic Methodology Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3916445/
https://www.ncbi.nlm.nih.gov/pubmed/24491042
http://dx.doi.org/10.1186/1471-2105-15-37