Recurrence prediction of lung adenocarcinoma using an immune gene expression and clinical data trained and validated support vector machine classifier
BACKGROUND: The immune microenvironment plays a critical role in cancer from onset to relapse. Machine learning (ML) algorithms can facilitate the analysis of laboratory and clinical data to predict lung cancer recurrence. Prompt detection and intervention are crucial for long-term survival in lung cancer relap...
Main Authors: | Shen, Yingran; Goparaju, Chandra; Yang, Yang; Babu, Benson A.; Gai, Weiming; Pass, Harvey; Jiang, Gening |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | AME Publishing Company 2023 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10654435/ https://www.ncbi.nlm.nih.gov/pubmed/38025809 http://dx.doi.org/10.21037/tlcr-23-473 |
_version_ | 1785136623030108160 |
---|---|
author | Shen, Yingran Goparaju, Chandra Yang, Yang Babu, Benson A. Gai, Weiming Pass, Harvey Jiang, Gening |
author_sort | Shen, Yingran |
collection | PubMed |
description | BACKGROUND: The immune microenvironment plays a critical role in cancer from onset to relapse. Machine learning (ML) algorithms can facilitate the analysis of laboratory and clinical data to predict lung cancer recurrence. Prompt detection and intervention are crucial for long-term survival in lung cancer relapse. Our study aimed to evaluate the clinical and genomic prognosticators for lung cancer recurrence by comparing the predictive accuracy of four ML models. METHODS: A total of 41 early-stage lung cancer patients who underwent surgery between June 2007 and October 2014 at New York University Langone Medical Center were included (with recurrence, n=16; without recurrence, n=25). All patients had tumor tissue and buffy coat collected at the time of resection. The CIBERSORT algorithm was used to quantify tumor-infiltrating immune cells (TIICs). Protein-protein interaction (PPI) network and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analyses were conducted to identify potential molecular drivers of tumor progression. The data were split into training (75%) and validation (25%) sets. Ensemble linear-kernel support vector machine (SVM) models were developed using optimized clinical and genomic features to predict tumor recurrence. RESULTS: Activated natural killer (NK) cells, M0 macrophages, and M1 macrophages were positively correlated with progression. Conversely, resting memory CD4(+) T cells were negatively correlated. In the PPI network, TNF and IL6 emerged as prominent hub genes. Prediction models using clinicopathological prognostic factors, tumor gene expression (45 genes), and buffy coat gene expression (47 genes) yielded areas under the receiver operating characteristic (ROC) curve (AUCs) of 62.7%, 65.4%, and 59.7% in the training set and 58.3%, 83.3%, and 75.0% in the validation set, respectively. Notably, merging gene expression with clinical data in a linear SVM model led to a significant boost in accuracy, with an AUC of 92.0% in training and 91.7% in validation. CONCLUSIONS: Using ML algorithms, immune gene expression data from tumor tissue and buffy coat may enhance the precision of lung cancer recurrence prediction. |
format | Online Article Text |
id | pubmed-10654435 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | AME Publishing Company |
record_format | MEDLINE/PubMed |
spelling | pubmed-10654435 2023-10-31 Transl Lung Cancer Res Original Article AME Publishing Company 2023-10-27 2023-10-31 /pmc/articles/PMC10654435/ /pubmed/38025809 http://dx.doi.org/10.21037/tlcr-23-473 Text en 2023 Translational Lung Cancer Research. All rights reserved. https://creativecommons.org/licenses/by-nc-nd/4.0/ Open Access Statement: This is an Open Access article distributed in accordance with the Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International License (CC BY-NC-ND 4.0), which permits the non-commercial replication and distribution of the article with the strict proviso that no changes or edits are made and the original work is properly cited (including links to both the formal publication through the relevant DOI and the license). See: https://creativecommons.org/licenses/by-nc-nd/4.0/. |
title | Recurrence prediction of lung adenocarcinoma using an immune gene expression and clinical data trained and validated support vector machine classifier |
topic | Original Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10654435/ https://www.ncbi.nlm.nih.gov/pubmed/38025809 http://dx.doi.org/10.21037/tlcr-23-473 |
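
The description above reports a 75%/25% training/validation split and ROC-AUC values for each model. The following is a minimal sketch of those two steps only, not the authors' pipeline: the helper names `stratified_split` and `roc_auc` are hypothetical, the stratification by recurrence status and the rounding rule are assumptions (the abstract does not say how the split was drawn), and the SVM itself is omitted. Only the cohort sizes (16 with recurrence, 25 without) come from the record.

```python
import random


def stratified_split(labels, train_frac=0.75, seed=0):
    """Split indices into training and validation sets, class by class.

    Assumption: the split is stratified by label; the abstract only
    states the 75%/25% proportions.
    """
    rng = random.Random(seed)
    train_idx, valid_idx = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        cut = round(train_frac * len(idx))
        train_idx += idx[:cut]
        valid_idx += idx[cut:]
    return sorted(train_idx), sorted(valid_idx)


def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    A tie between a positive and a negative score counts as half a pair.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Cohort sizes from the abstract: 1 = recurrence, 0 = no recurrence.
labels = [1] * 16 + [0] * 25
train_idx, valid_idx = stratified_split(labels)
print(len(train_idx), len(valid_idx))

# Hypothetical classifier scores for four patients, for illustration only.
print(roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1]))
```

Because the AUC is a ratio of correctly ordered pairs, a validation set of about ten patients can only take a small number of distinct AUC values, which is worth keeping in mind when comparing the validation figures quoted in the description.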