Discovering the molecular differences between right- and left-sided colon cancer using machine learning methods
BACKGROUND: In recent years, the differences between left-sided colon cancer (LCC) and right-sided colon cancer (RCC) have received increasing attention due to the clinicopathological variation between them. However, some of these differences have remained unclear and conflicting results have been r...
Main authors: | Jiang, Yimei; Yan, Xiaowei; Liu, Kun; Shi, Yiqing; Wang, Changgang; Hu, Jiele; Li, You; Wu, Qinghua; Xiang, Ming; Zhao, Ren |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2020 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7574488/ https://www.ncbi.nlm.nih.gov/pubmed/33076847 http://dx.doi.org/10.1186/s12885-020-07507-8 |
_version_ | 1783597647500673024 |
---|---|
author | Jiang, Yimei Yan, Xiaowei Liu, Kun Shi, Yiqing Wang, Changgang Hu, Jiele Li, You Wu, Qinghua Xiang, Ming Zhao, Ren |
author_facet | Jiang, Yimei Yan, Xiaowei Liu, Kun Shi, Yiqing Wang, Changgang Hu, Jiele Li, You Wu, Qinghua Xiang, Ming Zhao, Ren |
author_sort | Jiang, Yimei |
collection | PubMed |
description | BACKGROUND: In recent years, the differences between left-sided colon cancer (LCC) and right-sided colon cancer (RCC) have received increasing attention due to the clinicopathological variation between them. However, some of these differences have remained unclear and conflicting results have been reported. METHODS: From The Cancer Genome Atlas (TCGA), we obtained RNA sequencing data and gene mutation data on 323 and 283 colon cancer patients, respectively. Differential analysis was first performed on gene expression data and mutation data between LCC and RCC, separately. Machine learning (ML) methods were then used to select key genes or mutations as features to construct models to classify LCC and RCC patients. Finally, we conducted correlation analysis to identify the correlations between differentially expressed genes (DEGs) and mutations using logistic regression (LR) models. RESULTS: We found distinct gene mutation and expression patterns between LCC and RCC patients and further selected the 30 most important mutations and 17 most important gene expression features using ML methods. The classification models created using these features classified LCC and RCC patients with high accuracy (areas under the curve (AUC) of 0.8 and 0.96 for mutation and gene expression data, respectively). The expression of PRAC1 and the BRAF V600E mutation (rs113488022) were the most important features of the gene expression and mutation models, respectively. Correlations of mutations and gene expression data were also identified using LR models. Among them, rs113488022 was found to have significant relevance to the expression of four genes, and thus should be focused on in further study. CONCLUSIONS: On the basis of ML methods, we found some key molecular differences between LCC and RCC, which could differentiate these two groups of patients with high accuracy. These differences might be key factors behind the variation in clinical features between LCC and RCC and thus help to improve treatment, such as determining the appropriate therapy for patients. |
format | Online Article Text |
id | pubmed-7574488 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-7574488 2020-10-20 Discovering the molecular differences between right- and left-sided colon cancer using machine learning methods Jiang, Yimei Yan, Xiaowei Liu, Kun Shi, Yiqing Wang, Changgang Hu, Jiele Li, You Wu, Qinghua Xiang, Ming Zhao, Ren BMC Cancer Research Article BACKGROUND: In recent years, the differences between left-sided colon cancer (LCC) and right-sided colon cancer (RCC) have received increasing attention due to the clinicopathological variation between them. However, some of these differences have remained unclear and conflicting results have been reported. METHODS: From The Cancer Genome Atlas (TCGA), we obtained RNA sequencing data and gene mutation data on 323 and 283 colon cancer patients, respectively. Differential analysis was first performed on gene expression data and mutation data between LCC and RCC, separately. Machine learning (ML) methods were then used to select key genes or mutations as features to construct models to classify LCC and RCC patients. Finally, we conducted correlation analysis to identify the correlations between differentially expressed genes (DEGs) and mutations using logistic regression (LR) models. RESULTS: We found distinct gene mutation and expression patterns between LCC and RCC patients and further selected the 30 most important mutations and 17 most important gene expression features using ML methods. The classification models created using these features classified LCC and RCC patients with high accuracy (areas under the curve (AUC) of 0.8 and 0.96 for mutation and gene expression data, respectively). The expression of PRAC1 and the BRAF V600E mutation (rs113488022) were the most important features of the gene expression and mutation models, respectively. Correlations of mutations and gene expression data were also identified using LR models. Among them, rs113488022 was found to have significant relevance to the expression of four genes, and thus should be focused on in further study. CONCLUSIONS: On the basis of ML methods, we found some key molecular differences between LCC and RCC, which could differentiate these two groups of patients with high accuracy. These differences might be key factors behind the variation in clinical features between LCC and RCC and thus help to improve treatment, such as determining the appropriate therapy for patients. BioMed Central 2020-10-19 /pmc/articles/PMC7574488/ /pubmed/33076847 http://dx.doi.org/10.1186/s12885-020-07507-8 Text en © The Author(s) 2020 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data. |
spellingShingle | Research Article Jiang, Yimei Yan, Xiaowei Liu, Kun Shi, Yiqing Wang, Changgang Hu, Jiele Li, You Wu, Qinghua Xiang, Ming Zhao, Ren Discovering the molecular differences between right- and left-sided colon cancer using machine learning methods |
title | Discovering the molecular differences between right- and left-sided colon cancer using machine learning methods |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7574488/ https://www.ncbi.nlm.nih.gov/pubmed/33076847 http://dx.doi.org/10.1186/s12885-020-07507-8 |