Cargando…
Identification of the diagnostic genes and immune cell infiltration characteristics of gastric cancer using bioinformatics analysis and machine learning
Background: Finding reliable diagnostic markers for gastric cancer (GC) is important. This work uses machine learning (ML) to identify GC diagnostic genes and investigate their connection with immune cell infiltration. Methods: We downloaded eight GC-related datasets from GEO, TCGA, and GTEx. GSE139...
Autores principales: | , , , , |
---|---|
Formato: | Online Artículo Texto |
Lenguaje: | English |
Publicado: |
Frontiers Media S.A.
2023
|
Materias: | |
Acceso en línea: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9845288/ https://www.ncbi.nlm.nih.gov/pubmed/36685898 http://dx.doi.org/10.3389/fgene.2022.1067524 |
_version_ | 1784870867385188352 |
---|---|
author | Xie, Rongjun Liu, Longfei Lu, Xianzhou He, Chengjian Li, Guoxin |
author_facet | Xie, Rongjun Liu, Longfei Lu, Xianzhou He, Chengjian Li, Guoxin |
author_sort | Xie, Rongjun |
collection | PubMed |
description | Background: Finding reliable diagnostic markers for gastric cancer (GC) is important. This work uses machine learning (ML) to identify GC diagnostic genes and investigate their connection with immune cell infiltration. Methods: We downloaded eight GC-related datasets from GEO, TCGA, and GTEx. GSE13911, GSE15459, GSE19826, GSE54129, and GSE79973 were used as the training set, GSE66229 as the validation set A, and TCGA & GTEx as the validation set B. First, the training set screened differentially expressed genes (DEGs), and gene ontology (GO), kyoto encyclopedia of genes and genomes (KEGG), disease Ontology (DO), and gene set enrichment analysis (GSEA) analyses were performed. Then, the candidate diagnostic genes were screened by LASSO and SVM-RFE algorithms, and receiver operating characteristic (ROC) curves evaluated the diagnostic efficacy. Then, the infiltration characteristics of immune cells in GC samples were analyzed by CIBERSORT, and correlation analysis was performed. Finally, mutation and survival analyses were performed for diagnostic genes. Results: We found 207 up-regulated genes and 349 down-regulated genes among 556 DEGs. gene ontology analysis significantly enriched 413 functional annotations, including 310 biological processes, 23 cellular components, and 80 molecular functions. Six of these biological processes are closely related to immunity. KEGG analysis significantly enriched 11 signaling pathways. 244 diseases were closely related to Ontology analysis. Multiple entries of the gene set enrichment analysis analysis were closely related to immunity. Machine learning screened eight candidate diagnostic genes and further validated them to identify ABCA8, COL4A1, FAP, LY6E, MAMDC2, and TMEM100 as diagnostic genes. Six diagnostic genes were mutated to some extent in GC. ABCA8, COL4A1, LY6E, MAMDC2, TMEM100 had prognostic value. Conclusion: We screened six diagnostic genes for gastric cancer through bioinformatic analysis and machine learning, which are intimately related to immune cell infiltration and have a definite prognostic value. |
format | Online Article Text |
id | pubmed-9845288 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-98452882023-01-19 Identification of the diagnostic genes and immune cell infiltration characteristics of gastric cancer using bioinformatics analysis and machine learning Xie, Rongjun Liu, Longfei Lu, Xianzhou He, Chengjian Li, Guoxin Front Genet Genetics Background: Finding reliable diagnostic markers for gastric cancer (GC) is important. This work uses machine learning (ML) to identify GC diagnostic genes and investigate their connection with immune cell infiltration. Methods: We downloaded eight GC-related datasets from GEO, TCGA, and GTEx. GSE13911, GSE15459, GSE19826, GSE54129, and GSE79973 were used as the training set, GSE66229 as the validation set A, and TCGA & GTEx as the validation set B. First, the training set screened differentially expressed genes (DEGs), and gene ontology (GO), kyoto encyclopedia of genes and genomes (KEGG), disease Ontology (DO), and gene set enrichment analysis (GSEA) analyses were performed. Then, the candidate diagnostic genes were screened by LASSO and SVM-RFE algorithms, and receiver operating characteristic (ROC) curves evaluated the diagnostic efficacy. Then, the infiltration characteristics of immune cells in GC samples were analyzed by CIBERSORT, and correlation analysis was performed. Finally, mutation and survival analyses were performed for diagnostic genes. Results: We found 207 up-regulated genes and 349 down-regulated genes among 556 DEGs. gene ontology analysis significantly enriched 413 functional annotations, including 310 biological processes, 23 cellular components, and 80 molecular functions. Six of these biological processes are closely related to immunity. KEGG analysis significantly enriched 11 signaling pathways. 244 diseases were closely related to Ontology analysis. Multiple entries of the gene set enrichment analysis analysis were closely related to immunity. Machine learning screened eight candidate diagnostic genes and further validated them to identify ABCA8, COL4A1, FAP, LY6E, MAMDC2, and TMEM100 as diagnostic genes. Six diagnostic genes were mutated to some extent in GC. ABCA8, COL4A1, LY6E, MAMDC2, TMEM100 had prognostic value. Conclusion: We screened six diagnostic genes for gastric cancer through bioinformatic analysis and machine learning, which are intimately related to immune cell infiltration and have a definite prognostic value. Frontiers Media S.A. 2023-01-04 /pmc/articles/PMC9845288/ /pubmed/36685898 http://dx.doi.org/10.3389/fgene.2022.1067524 Text en Copyright © 2023 Xie, Liu, Lu, He and Li. https://creativecommons.org/licenses/by/4.0/This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
spellingShingle | Genetics Xie, Rongjun Liu, Longfei Lu, Xianzhou He, Chengjian Li, Guoxin Identification of the diagnostic genes and immune cell infiltration characteristics of gastric cancer using bioinformatics analysis and machine learning |
title | Identification of the diagnostic genes and immune cell infiltration characteristics of gastric cancer using bioinformatics analysis and machine learning |
title_full | Identification of the diagnostic genes and immune cell infiltration characteristics of gastric cancer using bioinformatics analysis and machine learning |
title_fullStr | Identification of the diagnostic genes and immune cell infiltration characteristics of gastric cancer using bioinformatics analysis and machine learning |
title_full_unstemmed | Identification of the diagnostic genes and immune cell infiltration characteristics of gastric cancer using bioinformatics analysis and machine learning |
title_short | Identification of the diagnostic genes and immune cell infiltration characteristics of gastric cancer using bioinformatics analysis and machine learning |
title_sort | identification of the diagnostic genes and immune cell infiltration characteristics of gastric cancer using bioinformatics analysis and machine learning |
topic | Genetics |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9845288/ https://www.ncbi.nlm.nih.gov/pubmed/36685898 http://dx.doi.org/10.3389/fgene.2022.1067524 |
work_keys_str_mv | AT xierongjun identificationofthediagnosticgenesandimmunecellinfiltrationcharacteristicsofgastriccancerusingbioinformaticsanalysisandmachinelearning AT liulongfei identificationofthediagnosticgenesandimmunecellinfiltrationcharacteristicsofgastriccancerusingbioinformaticsanalysisandmachinelearning AT luxianzhou identificationofthediagnosticgenesandimmunecellinfiltrationcharacteristicsofgastriccancerusingbioinformaticsanalysisandmachinelearning AT hechengjian identificationofthediagnosticgenesandimmunecellinfiltrationcharacteristicsofgastriccancerusingbioinformaticsanalysisandmachinelearning AT liguoxin identificationofthediagnosticgenesandimmunecellinfiltrationcharacteristicsofgastriccancerusingbioinformaticsanalysisandmachinelearning |