An Integrated Approach for Identifying Molecular Subtypes in Human Colon Cancer Using Gene Expression Data

Identifying molecular subtypes of colorectal cancer (CRC) may allow for more rational, patient-specific treatment. Various studies have identified molecular subtypes for CRC using gene expression data, but they are inconsistent and further research is necessary. From a methodological point of view,...

Bibliographic Details
Main authors: Wang, Wen-Hui; Xie, Ting-Yan; Xie, Guang-Lei; Ren, Zhong-Lu; Li, Jin-Ming
Format: Online Article Text
Language: English
Published: MDPI 2018
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6115727/
https://www.ncbi.nlm.nih.gov/pubmed/30072645
http://dx.doi.org/10.3390/genes9080397
collection PubMed
description Identifying molecular subtypes of colorectal cancer (CRC) may allow for more rational, patient-specific treatment. Various studies have identified molecular subtypes for CRC using gene expression data, but they are inconsistent and further research is necessary. From a methodological point of view, a progressive approach is needed to identify molecular subtypes in human colon cancer using gene expression data. We propose an approach to identify the molecular subtypes of colon cancer that integrates denoising by the Bayesian robust principal component analysis (BRPCA) algorithm, hierarchical clustering by the directed bubble hierarchical tree (DBHT) algorithm, and feature gene selection by an improved differential evolution based feature selection method (DEFS(W)) algorithm. In this approach, the normal samples being completely and exclusively clustered into one class is considered to be the standard of reasonable clustering subtypes, and the feature selection pays attention to imbalances of samples among subtypes. With this approach, we identified the molecular subtypes of colon cancer on the mRNA gene expression dataset of 153 colon cancer samples and 19 normal control samples of the Cancer Genome Atlas (TCGA) project. The colon cancer was clustered into 7 subtypes with 44 feature genes. Our approach could identify finer subtypes of colon cancer with fewer feature genes than the other two recent studies and exhibits a generic methodology that might be applied to identify the subtypes of other cancers.
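The description states that a clustering is considered reasonable when the normal samples are completely and exclusively clustered into one class. A minimal sketch of that check follows; the function name, the label encoding, and the boolean `is_normal` flags are assumptions made for illustration and are not taken from the article, and the sketch does not implement BRPCA, DBHT, or DEFS(W).

```python
def normals_form_exclusive_cluster(cluster_labels, is_normal):
    """Return True when all normal samples share one cluster
    and that cluster contains no tumor sample.

    cluster_labels: one cluster label per sample (hypothetical encoding).
    is_normal: one boolean per sample, True for a normal control.
    """
    if len(cluster_labels) != len(is_normal):
        raise ValueError("cluster_labels and is_normal must have equal length")

    # "Completely": every normal sample must land in the same cluster.
    normal_clusters = {c for c, n in zip(cluster_labels, is_normal) if n}
    if len(normal_clusters) != 1:
        return False  # normals are split across clusters, or there are none

    # "Exclusively": that cluster must not contain any tumor sample.
    (normal_cluster,) = normal_clusters
    return all(n for c, n in zip(cluster_labels, is_normal) if c == normal_cluster)


# Toy example: two normals alone in cluster 0, three tumors in clusters 1 and 2.
labels = [0, 0, 1, 1, 2]
normal_flags = [True, True, False, False, False]
criterion_met = normals_form_exclusive_cluster(labels, normal_flags)
```

In a setting like the one described, such a check could be evaluated for each candidate cut of a hierarchical clustering, keeping only the cuts that satisfy it.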
id pubmed-6115727
institution National Center for Biotechnology Information
record_format MEDLINE/PubMed
spelling pubmed-6115727; 2018-08-31; Genes (Basel); Article; MDPI; 2018-08-02; /pmc/articles/PMC6115727/; /pubmed/30072645; http://dx.doi.org/10.3390/genes9080397; Text; en
© 2018 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
topic Article