
Identification of the diagnostic genes and immune cell infiltration characteristics of gastric cancer using bioinformatics analysis and machine learning

Background: Finding reliable diagnostic markers for gastric cancer (GC) is important. This work uses machine learning (ML) to identify GC diagnostic genes and investigate their connection with immune cell infiltration. Methods: We downloaded eight GC-related datasets from GEO, TCGA, and GTEx. GSE139...


Bibliographic Details
Main authors: Xie, Rongjun, Liu, Longfei, Lu, Xianzhou, He, Chengjian, Li, Guoxin
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9845288/
https://www.ncbi.nlm.nih.gov/pubmed/36685898
http://dx.doi.org/10.3389/fgene.2022.1067524
author Xie, Rongjun
Liu, Longfei
Lu, Xianzhou
He, Chengjian
Li, Guoxin
author_sort Xie, Rongjun
collection PubMed
description Background: Finding reliable diagnostic markers for gastric cancer (GC) is important. This work uses machine learning (ML) to identify GC diagnostic genes and investigate their connection with immune cell infiltration. Methods: We downloaded eight GC-related datasets from GEO, TCGA, and GTEx. GSE13911, GSE15459, GSE19826, GSE54129, and GSE79973 were used as the training set, GSE66229 as validation set A, and TCGA & GTEx as validation set B. First, differentially expressed genes (DEGs) were screened in the training set, and Gene Ontology (GO), Kyoto Encyclopedia of Genes and Genomes (KEGG), Disease Ontology (DO), and gene set enrichment analysis (GSEA) were performed. Next, candidate diagnostic genes were screened with the LASSO and SVM-RFE algorithms, and their diagnostic efficacy was evaluated with receiver operating characteristic (ROC) curves. The infiltration characteristics of immune cells in GC samples were then analyzed with CIBERSORT, and correlation analysis was performed. Finally, mutation and survival analyses were performed for the diagnostic genes. Results: Among 556 DEGs, 207 were up-regulated and 349 were down-regulated. GO analysis significantly enriched 413 functional annotations, including 310 biological processes, 23 cellular components, and 80 molecular functions; six of these biological processes are closely related to immunity. KEGG analysis significantly enriched 11 signaling pathways. DO analysis identified 244 closely related diseases. Multiple GSEA entries were closely related to immunity. Machine learning screened eight candidate diagnostic genes, and further validation identified ABCA8, COL4A1, FAP, LY6E, MAMDC2, and TMEM100 as diagnostic genes. All six diagnostic genes were mutated to some extent in GC. ABCA8, COL4A1, LY6E, MAMDC2, and TMEM100 had prognostic value.
Conclusion: Through bioinformatic analysis and machine learning, we screened six diagnostic genes for gastric cancer that are closely related to immune cell infiltration and have definite prognostic value.
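The abstract evaluates each candidate gene with ROC curves. As a rough illustration only (not the authors' pipeline), the area under a ROC curve for a single gene can be computed from the rank-sum identity: it equals the fraction of tumor/normal sample pairs in which the tumor sample has the higher value. The function name `roc_auc`, the 1/0 label encoding, and the expression values below are all hypothetical and are not data from the study.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via the rank-sum (Mann-Whitney U) identity.

    labels: 1 for tumor, 0 for normal (assumed encoding for this sketch).
    scores: one value per sample, e.g. the expression of one candidate gene.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one sample of each class")
    # Count tumor/normal pairs where the tumor sample scores higher;
    # tied pairs count as half a win.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Made-up expression values for illustration only (not from the study).
labels = [1, 1, 1, 0, 0, 0]
expression = [8.1, 7.4, 5.9, 6.2, 4.8, 4.1]
auc = roc_auc(labels, expression)
```

With raw expression as the score, a gene that is lower in tumors than in normal tissue gives an AUC below 0.5; in that case 1 - AUC reflects its discriminating power.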
format Online
Article
Text
id pubmed-9845288
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-9845288 2023-01-19 Identification of the diagnostic genes and immune cell infiltration characteristics of gastric cancer using bioinformatics analysis and machine learning Xie, Rongjun Liu, Longfei Lu, Xianzhou He, Chengjian Li, Guoxin Front Genet Genetics Frontiers Media S.A. 2023-01-04 /pmc/articles/PMC9845288/ /pubmed/36685898 http://dx.doi.org/10.3389/fgene.2022.1067524 Text en Copyright © 2023 Xie, Liu, Lu, He and Li. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title Identification of the diagnostic genes and immune cell infiltration characteristics of gastric cancer using bioinformatics analysis and machine learning
topic Genetics