Prognostic model development for classification of colorectal adenocarcinoma by using machine learning model based on feature selection technique boruta

Colorectal cancer (CRC) is the third most prevalent cancer type and accounts for nearly one million deaths worldwide. The CRC mRNA gene expression datasets from TCGA and GEO (GSE144259, GSE50760, and GSE87096) were analyzed to find the significant differentially expressed genes (DEGs). These signifi...

Bibliographic Details
Main authors: Maurya, Neha Shree; Kushwah, Shikha; Kushwaha, Sandeep; Chawade, Aakash; Mani, Ashutosh
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10115869/
https://www.ncbi.nlm.nih.gov/pubmed/37076536
http://dx.doi.org/10.1038/s41598-023-33327-4
author Maurya, Neha Shree
Kushwah, Shikha
Kushwaha, Sandeep
Chawade, Aakash
Mani, Ashutosh
author_sort Maurya, Neha Shree
collection PubMed
description Colorectal cancer (CRC) is the third most prevalent cancer type and accounts for nearly one million deaths worldwide. CRC mRNA gene expression datasets from TCGA and GEO (GSE144259, GSE50760, and GSE87096) were analyzed to identify significant differentially expressed genes (DEGs). These genes were then subjected to feature selection with Boruta, and the confirmed important features (genes) were used to develop an ML-based prognostic classification model. The genes were also assessed through survival analysis and through correlation analysis between the final genes and infiltrating immunocytes. A total of 770 CRC samples were included, comprising 78 normal and 692 tumor tissue samples. DESeq2 analysis combined with the topconfects R package identified 170 significant DEGs. The random forest (RF) prognostic classification model built on the 33 confirmed important features achieved accuracy, precision, recall, and F1-score of 100% with 0% standard deviation. Overall survival analysis narrowed the set to GLP2R and VSTM2A, which were significantly downregulated in tumor samples and strongly correlated with immunocyte infiltration. The involvement of these genes in CRC prognosis was further supported by their biological function and by literature analysis. The current findings indicate that GLP2R and VSTM2A may play a significant role in CRC progression and immune response suppression.
format Online
Article
Text
id pubmed-10115869
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-10115869 2023-04-21 Prognostic model development for classification of colorectal adenocarcinoma by using machine learning model based on feature selection technique boruta Maurya, Neha Shree Kushwah, Shikha Kushwaha, Sandeep Chawade, Aakash Mani, Ashutosh Sci Rep Article
Nature Publishing Group UK 2023-04-19 /pmc/articles/PMC10115869/ /pubmed/37076536 http://dx.doi.org/10.1038/s41598-023-33327-4 Text en © The Author(s) 2023 https://creativecommons.org/licenses/by/4.0/ Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
title Prognostic model development for classification of colorectal adenocarcinoma by using machine learning model based on feature selection technique boruta
topic Article