Construction of diagnostic and prognostic models based on gene signatures of nasopharyngeal carcinoma by machine learning methods
BACKGROUND: Diagnostic models based on gene signatures of nasopharyngeal carcinoma (NPC) were constructed by random forest (RF) and artificial neural network (ANN) algorithms. Least absolute shrinkage and selection operator (Lasso)-Cox regression was used to select and build prognostic models based...
Main authors: | Wang, Yiren; He, Yongcheng; Duan, Xiaodong; Pang, Haowen; Zhou, Ping |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | AME Publishing Company 2023 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10248568/ https://www.ncbi.nlm.nih.gov/pubmed/37304552 http://dx.doi.org/10.21037/tcr-22-2700 |
_version_ | 1785055404725633024 |
---|---|
author | Wang, Yiren He, Yongcheng Duan, Xiaodong Pang, Haowen Zhou, Ping |
author_facet | Wang, Yiren He, Yongcheng Duan, Xiaodong Pang, Haowen Zhou, Ping |
author_sort | Wang, Yiren |
collection | PubMed |
description | BACKGROUND: Diagnostic models based on gene signatures of nasopharyngeal carcinoma (NPC) were constructed by random forest (RF) and artificial neural network (ANN) algorithms. Least absolute shrinkage and selection operator (Lasso)-Cox regression was used to select and build prognostic models based on gene signatures. This study contributes to the early diagnosis, treatment, and prognosis of NPC and to the understanding of its molecular mechanisms. METHODS: Two gene expression datasets were downloaded from the Gene Expression Omnibus (GEO) database, and differentially expressed genes (DEGs) associated with NPC were identified by differential gene expression analysis. Subsequently, significant DEGs were identified by an RF algorithm. An ANN was used to construct a diagnostic model for NPC. The performance of the diagnostic model was evaluated by area under the curve (AUC) values using a validation set. Lasso-Cox regression was used to examine gene signatures associated with prognosis. Overall survival (OS) and disease-free survival (DFS) prediction models were constructed and validated using data from The Cancer Genome Atlas (TCGA) database and the International Cancer Genome Consortium (ICGC) database. RESULTS: A total of 582 DEGs associated with NPC were identified, and 14 significant genes were identified by the RF algorithm. A diagnostic model for NPC was successfully constructed using an ANN, and the validity of the model was confirmed on the training set [AUC = 0.947; 95% confidence interval (CI): 0.911–0.969] and the validation set (AUC = 0.864; 95% CI: 0.828–0.901). The 24-gene signatures associated with prognosis were identified by Lasso-Cox regression, and prediction models for OS and DFS of NPC were constructed on the training set. Finally, the predictive ability of the model was validated on the validation set. CONCLUSIONS: Several potential gene signatures associated with NPC were identified, and a high-performance predictive model for early diagnosis of NPC and a prognostic prediction model with robust performance were successfully developed. The results of this study provide valuable references for future research on the early diagnosis, screening, treatment, and molecular mechanisms of NPC. |
format | Online Article Text |
id | pubmed-10248568 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | AME Publishing Company |
record_format | MEDLINE/PubMed |
spelling | pubmed-102485682023-06-09 Construction of diagnostic and prognostic models based on gene signatures of nasopharyngeal carcinoma by machine learning methods Wang, Yiren He, Yongcheng Duan, Xiaodong Pang, Haowen Zhou, Ping Transl Cancer Res Original Article BACKGROUND: Diagnostic models based on gene signatures of nasopharyngeal carcinoma (NPC) were constructed by random forest (RF) and artificial neural network (ANN) algorithms. Least absolute shrinkage and selection operator (Lasso)-Cox regression was used to select and build prognostic models based on gene signatures. This study contributes to the early diagnosis and treatment, prognosis, and molecular mechanisms associated with NPC. METHODS: Two gene expression datasets were downloaded from the Gene Expression Omnibus (GEO) database, and differentially expressed genes (DEGs) associated with NPC were identified by gene expression differential analysis. Subsequently, significant DEGs were identified by a RF algorithm. ANN were used to construct a diagnostic model for NPC. The performance of the diagnostic model was evaluated by area under the curve (AUC) values using a validation set. Lasso-Cox regression examined gene signatures associated with prognosis. Overall survival (OS) and disease-free survival (DFS) prediction models were constructed and validated from The Cancer Genome Atlas (TCGA) database and the International Cancer Genome Consortium (ICGC) database. RESULTS: A total of 582 DEGs associated with NPC were identified, and 14 significant genes were identified by the RF algorithm. A diagnostic model for NPC was successfully constructed using ANN, and the validity of the model was confirmed on the training set AUC =0.947 [95% confidence interval (CI): 0.911–0.969] and the validation set AUC =0.864 (95% CI: 0.828–0.901). The 24-gene signatures associated with prognosis were identified by Lasso-Cox regression, and prediction models for OS and DFS of NPC were constructed on the training set. 
Finally, the ability of the model was validated on the validation set. CONCLUSIONS: Several potential gene signatures associated with NPC were identified, and a high-performance predictive model for early diagnosis of NPC and a prognostic prediction model with robust performance were successfully developed. The results of this study provide valuable references for early diagnosis, screening, treatment and molecular mechanism research of NPC in the future. AME Publishing Company 2023-04-10 2023-05-31 /pmc/articles/PMC10248568/ /pubmed/37304552 http://dx.doi.org/10.21037/tcr-22-2700 Text en 2023 Translational Cancer Research. All rights reserved. https://creativecommons.org/licenses/by-nc-nd/4.0/Open Access Statement: This is an Open Access article distributed in accordance with the Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International License (CC BY-NC-ND 4.0), which permits the non-commercial replication and distribution of the article with the strict proviso that no changes or edits are made and the original work is properly cited (including links to both the formal publication through the relevant DOI and the license). See: https://creativecommons.org/licenses/by-nc-nd/4.0 (https://creativecommons.org/licenses/by-nc-nd/4.0/) . |
spellingShingle | Original Article Wang, Yiren He, Yongcheng Duan, Xiaodong Pang, Haowen Zhou, Ping Construction of diagnostic and prognostic models based on gene signatures of nasopharyngeal carcinoma by machine learning methods |
title | Construction of diagnostic and prognostic models based on gene signatures of nasopharyngeal carcinoma by machine learning methods |
title_sort | construction of diagnostic and prognostic models based on gene signatures of nasopharyngeal carcinoma by machine learning methods |
topic | Original Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10248568/ https://www.ncbi.nlm.nih.gov/pubmed/37304552 http://dx.doi.org/10.21037/tcr-22-2700 |
work_keys_str_mv | AT wangyiren constructionofdiagnosticandprognosticmodelsbasedongenesignaturesofnasopharyngealcarcinomabymachinelearningmethods AT heyongcheng constructionofdiagnosticandprognosticmodelsbasedongenesignaturesofnasopharyngealcarcinomabymachinelearningmethods AT duanxiaodong constructionofdiagnosticandprognosticmodelsbasedongenesignaturesofnasopharyngealcarcinomabymachinelearningmethods AT panghaowen constructionofdiagnosticandprognosticmodelsbasedongenesignaturesofnasopharyngealcarcinomabymachinelearningmethods AT zhouping constructionofdiagnosticandprognosticmodelsbasedongenesignaturesofnasopharyngealcarcinomabymachinelearningmethods |
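
The abstract reports that the diagnostic model was evaluated by area under the curve (AUC) values. As a rough illustration of what that metric measures, the sketch below computes AUC as the probability that a randomly chosen positive sample scores higher than a randomly chosen negative one. It is not the authors' code: the function name `roc_auc` and the toy labels and scores are hypothetical and unrelated to the study's data.

```python
def roc_auc(labels, scores):
    """Return the ROC AUC via pairwise comparison (ties count as 0.5).

    labels: iterable of 0/1 class labels (1 = positive, e.g. tumour sample)
    scores: iterable of model scores, higher meaning "more likely positive"
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0      # positive correctly ranked above negative
            elif p == n:
                wins += 0.5      # tie contributes half a win
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, for illustration only
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

The pairwise form is quadratic in the number of samples, which is fine for a small illustration; a rank-based formulation or an established statistics library would be the more usual choice for real datasets.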