Identifying Stage II Colorectal Cancer Recurrence Associated Genes by Microarray Meta-Analysis and Building Predictive Models with Machine Learning Algorithms

BACKGROUND: Stage II colorectal cancer patients have a heterogeneous prognosis, and patients with recurrent events have poor survival. In this study, we aimed to identify stage II colorectal cancer recurrence-associated genes by microarray meta-analysis and to build predictive models to stratify patients'...

Bibliographic Details
Main authors: Lu, Wei, Pan, Xiang, Dai, Siqi, Fu, Dongliang, Hwang, Maxwell, Zhu, Yingshuang, Zhang, Lina, Wei, Jingsun, Kong, Xiangxing, Li, Jun, Xiao, Qian, Ding, Kefeng
Format: Online Article Text
Language: English
Published: Hindawi 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7889382/
https://www.ncbi.nlm.nih.gov/pubmed/33628243
http://dx.doi.org/10.1155/2021/6657397
_version_ 1783652297381773312
author Lu, Wei
Pan, Xiang
Dai, Siqi
Fu, Dongliang
Hwang, Maxwell
Zhu, Yingshuang
Zhang, Lina
Wei, Jingsun
Kong, Xiangxing
Li, Jun
Xiao, Qian
Ding, Kefeng
author_sort Lu, Wei
collection PubMed
description BACKGROUND: Stage II colorectal cancer patients have a heterogeneous prognosis, and patients with recurrent events have poor survival. In this study, we aimed to identify stage II colorectal cancer recurrence-associated genes by microarray meta-analysis and to build predictive models to stratify patients' recurrence-free survival. METHODS: We searched the GEO database to retrieve eligible microarray datasets. Microarray meta-analysis was used to identify universal recurrence-associated genes. The pooled samples were randomly divided into a training set and a test set. Two survival models (a lasso Cox model and a random survival forest model) were trained on the training set, and AUC values of the time-dependent receiver operating characteristic (ROC) curves were calculated. Survival analysis was performed to determine whether there was a significant difference between the predicted high-risk and low-risk groups in the test set. RESULTS: Six datasets containing 651 stage II colorectal cancer patients were included in this study. The microarray meta-analysis identified 479 recurrence-associated genes. KEGG and GO enrichment analysis showed that G protein-coupled glutamate receptor binding and Hedgehog signaling were significantly enriched. AUC values of the lasso Cox model and the random survival forest model were 0.815 and 0.993 at 60 months, respectively. In addition, the random survival forest model demonstrated that the effects of gene expression on the recurrence-free survival probability were nonlinear. According to the risk scores computed by the random survival forest model, the high-risk group had a significantly higher recurrence risk than the low-risk group (HR = 1.824, 95% CI: 1.079–3.084, p = 0.025). CONCLUSIONS: We identified 479 stage II colorectal cancer recurrence-associated genes by microarray meta-analysis. The random survival forest model, which was based on the recurrence-associated gene signature, could strongly predict the recurrence risk of stage II colorectal cancer patients.
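The hazard ratio quoted in the description can be sanity-checked against its own confidence interval. The sketch below assumes a Wald-type 95% interval that is symmetric on the log scale and a two-sided normal test; the record does not say how the interval or p-value was computed, so this is an illustration and not the authors' procedure. The function name `wald_check` is hypothetical.

```python
import math

def wald_check(hr, ci_low, ci_high, z_crit=1.959964):
    """Recover SE(log HR), z and a two-sided p-value from an HR and its 95% CI.

    Assumption (not stated in the record): the CI is a Wald interval,
    i.e. log(HR) +/- z_crit * SE on the log scale.
    """
    log_hr = math.log(hr)
    # Width of the interval on the log scale equals 2 * z_crit * SE.
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = log_hr / se
    # Two-sided tail probability of a standard normal variable.
    p = math.erfc(abs(z) / math.sqrt(2))
    # For a log-symmetric interval, the geometric mean of the bounds is the HR.
    midpoint = math.sqrt(ci_low * ci_high)
    return se, z, p, midpoint

# Values taken from the description field above.
se, z, p, midpoint = wald_check(1.824, 1.079, 3.084)
print(f"SE(log HR) = {se:.3f}, z = {z:.2f}, p = {p:.3f}, CI midpoint = {midpoint:.3f}")
```

Under these assumptions the recomputed p-value rounds to the reported 0.025, and the geometric midpoint of the interval agrees with the reported HR of 1.824, so the three reported numbers are mutually consistent.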
format Online
Article
Text
id pubmed-7889382
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Hindawi
record_format MEDLINE/PubMed
spelling pubmed-7889382 2021-02-23 J Oncol Research Article Hindawi 2021-02-10 /pmc/articles/PMC7889382/ /pubmed/33628243 http://dx.doi.org/10.1155/2021/6657397 Text en Copyright © 2021 Wei Lu et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Identifying Stage II Colorectal Cancer Recurrence Associated Genes by Microarray Meta-Analysis and Building Predictive Models with Machine Learning Algorithms
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7889382/
https://www.ncbi.nlm.nih.gov/pubmed/33628243
http://dx.doi.org/10.1155/2021/6657397