Prognostic model development for classification of colorectal adenocarcinoma by using machine learning model based on feature selection technique boruta
Colorectal cancer (CRC) is the third most prevalent cancer type and accounts for nearly one million deaths worldwide. The CRC mRNA gene expression datasets from TCGA and GEO (GSE144259, GSE50760, and GSE87096) were analyzed to find the significant differentially expressed genes (DEGs). These signifi...
Main authors: | Maurya, Neha Shree; Kushwah, Shikha; Kushwaha, Sandeep; Chawade, Aakash; Mani, Ashutosh |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Nature Publishing Group UK, 2023 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10115869/ https://www.ncbi.nlm.nih.gov/pubmed/37076536 http://dx.doi.org/10.1038/s41598-023-33327-4 |
_version_ | 1785028300637208576 |
---|---|
author | Maurya, Neha Shree; Kushwah, Shikha; Kushwaha, Sandeep; Chawade, Aakash; Mani, Ashutosh |
author_facet | Maurya, Neha Shree; Kushwah, Shikha; Kushwaha, Sandeep; Chawade, Aakash; Mani, Ashutosh |
author_sort | Maurya, Neha Shree |
collection | PubMed |
description | Colorectal cancer (CRC) is the third most prevalent cancer type and accounts for nearly one million deaths worldwide. CRC mRNA gene expression datasets from TCGA and GEO (GSE144259, GSE50760, and GSE87096) were analyzed to identify significant differentially expressed genes (DEGs). These genes were then passed through Boruta feature selection, and the confirmed important features (genes) were used to develop an ML-based prognostic classification model. The selected genes were further examined by survival analysis and by correlation analysis against infiltrating immunocytes. A total of 770 CRC samples were included: 78 normal and 692 tumor tissue samples. DESeq2 analysis combined with the topconfects R package identified 170 significant DEGs. An RF prognostic classification model built on the 33 confirmed important features achieved accuracy, precision, recall, and F1-score of 100% with 0% standard deviation. Overall survival analysis narrowed the candidates to GLP2R and VSTM2A, which were significantly downregulated in tumor samples and strongly correlated with immunocyte infiltration. The involvement of these genes in CRC prognosis was further supported by their biological function and by literature analysis. The current findings indicate that GLP2R and VSTM2A may play a significant role in CRC progression and immune response suppression. |
format | Online Article Text |
id | pubmed-10115869 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Nature Publishing Group UK |
record_format | MEDLINE/PubMed |
spelling | pubmed-10115869 2023-04-21 Prognostic model development for classification of colorectal adenocarcinoma by using machine learning model based on feature selection technique boruta Maurya, Neha Shree Kushwah, Shikha Kushwaha, Sandeep Chawade, Aakash Mani, Ashutosh Sci Rep Article Colorectal cancer (CRC) is the third most prevalent cancer type and accounts for nearly one million deaths worldwide. The CRC mRNA gene expression datasets from TCGA and GEO (GSE144259, GSE50760, and GSE87096) were analyzed to find the significant differentially expressed genes (DEGs). These significant genes were further processed for feature selection through boruta and the confirmed features of importance (genes) were subsequently used for ML-based prognostic classification model development. These genes were analyzed for survival and correlation analysis between final genes and infiltrated immunocytes. A total of 770 CRC samples were included having 78 normal and 692 tumor tissue samples. 170 significant DEGs were identified after DESeq2 analysis along with the topconfects R package. The 33 confirmed features of importance-based RF prognostic classification model have given accuracy, precision, recall, and f1-score of 100% with 0% standard deviation. The overall survival analysis had finalized GLP2R and VSTM2A genes that were significantly downregulated in tumor samples and had a strong correlation with immunocyte infiltration. The involvement of these genes in CRC prognosis was further confirmed on the basis of their biological function and literature analysis. The current findings indicate that GLP2R and VSTM2A may play a significant role in CRC progression and immune response suppression. 
Nature Publishing Group UK 2023-04-19 /pmc/articles/PMC10115869/ /pubmed/37076536 http://dx.doi.org/10.1038/s41598-023-33327-4 Text en © The Author(s) 2023 https://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ (https://creativecommons.org/licenses/by/4.0/). |
spellingShingle | Article Maurya, Neha Shree Kushwah, Shikha Kushwaha, Sandeep Chawade, Aakash Mani, Ashutosh Prognostic model development for classification of colorectal adenocarcinoma by using machine learning model based on feature selection technique boruta |
title | Prognostic model development for classification of colorectal adenocarcinoma by using machine learning model based on feature selection technique boruta |
title_sort | prognostic model development for classification of colorectal adenocarcinoma by using machine learning model based on feature selection technique boruta |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10115869/ https://www.ncbi.nlm.nih.gov/pubmed/37076536 http://dx.doi.org/10.1038/s41598-023-33327-4 |
work_keys_str_mv | AT mauryanehashree prognosticmodeldevelopmentforclassificationofcolorectaladenocarcinomabyusingmachinelearningmodelbasedonfeatureselectiontechniqueboruta AT kushwahshikha prognosticmodeldevelopmentforclassificationofcolorectaladenocarcinomabyusingmachinelearningmodelbasedonfeatureselectiontechniqueboruta AT kushwahasandeep prognosticmodeldevelopmentforclassificationofcolorectaladenocarcinomabyusingmachinelearningmodelbasedonfeatureselectiontechniqueboruta AT chawadeaakash prognosticmodeldevelopmentforclassificationofcolorectaladenocarcinomabyusingmachinelearningmodelbasedonfeatureselectiontechniqueboruta AT maniashutosh prognosticmodeldevelopmentforclassificationofcolorectaladenocarcinomabyusingmachinelearningmodelbasedonfeatureselectiontechniqueboruta |
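
The description field reports accuracy, precision, recall, and F1-score on a cohort of 78 normal and 692 tumor samples. As a reading aid, the following minimal Python sketch shows how those four metrics are computed from confusion counts and why they are usually reported together on an imbalanced cohort. It is not the authors' code: the `Confusion` class, the `confusion` helper, and the "always predict tumor" baseline are hypothetical names and constructs introduced here for illustration; only the two class sizes come from the record.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    """Binary confusion counts, with 'tumor' treated as the positive class."""
    tp: int
    fp: int
    fn: int
    tn: int

    @property
    def accuracy(self) -> float:
        return (self.tp + self.tn) / (self.tp + self.fp + self.fn + self.tn)

    @property
    def precision(self) -> float:
        denom = self.tp + self.fp
        return self.tp / denom if denom else 0.0

    @property
    def recall(self) -> float:
        denom = self.tp + self.fn
        return self.tp / denom if denom else 0.0

    @property
    def specificity(self) -> float:
        # Recall on the negative ('normal') class.
        denom = self.tn + self.fp
        return self.tn / denom if denom else 0.0

    @property
    def f1(self) -> float:
        p, r = self.precision, self.recall
        return 2 * p * r / (p + r) if (p + r) else 0.0


def confusion(y_true, y_pred, positive="tumor") -> Confusion:
    """Tally confusion counts from two equal-length label sequences."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return Confusion(tp, fp, fn, tn)


# Class sizes taken from the record: 78 normal and 692 tumor samples.
N_NORMAL, N_TUMOR = 78, 692
y_true = ["normal"] * N_NORMAL + ["tumor"] * N_TUMOR

# Hypothetical baseline (not from the paper): label every sample as tumor.
always_tumor = confusion(y_true, ["tumor"] * len(y_true))

# A classifier that reproduces every label, matching the reported 100% scores.
perfect = confusion(y_true, list(y_true))

for name, c in [("always-tumor baseline", always_tumor), ("perfect", perfect)]:
    print(
        f"{name}: accuracy={c.accuracy:.3f} precision={c.precision:.3f} "
        f"recall={c.recall:.3f} f1={c.f1:.3f} specificity={c.specificity:.3f}"
    )
```

With this class balance, the trivial baseline is correct on 692 of 770 samples while identifying no normal samples at all, which is why accuracy alone says little here and the per-class metrics matter.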