
Construction of diagnostic and prognostic models based on gene signatures of nasopharyngeal carcinoma by machine learning methods

BACKGROUND: Diagnostic models based on gene signatures of nasopharyngeal carcinoma (NPC) were constructed by random forest (RF) and artificial neural network (ANN) algorithms. Least absolute shrinkage and selection operator (Lasso)-Cox regression was used to select and build prognostic models based...

Bibliographic Details
Main Authors: Wang, Yiren, He, Yongcheng, Duan, Xiaodong, Pang, Haowen, Zhou, Ping
Format: Online Article Text
Language: English
Published: AME Publishing Company 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10248568/
https://www.ncbi.nlm.nih.gov/pubmed/37304552
http://dx.doi.org/10.21037/tcr-22-2700
author Wang, Yiren
He, Yongcheng
Duan, Xiaodong
Pang, Haowen
Zhou, Ping
collection PubMed
description BACKGROUND: Diagnostic models based on gene signatures of nasopharyngeal carcinoma (NPC) were constructed with random forest (RF) and artificial neural network (ANN) algorithms. Least absolute shrinkage and selection operator (Lasso)-Cox regression was used to select gene signatures and build prognostic models from them. This study contributes to the early diagnosis, treatment, and prognosis of NPC and to the understanding of its molecular mechanisms.
METHODS: Two gene expression datasets were downloaded from the Gene Expression Omnibus (GEO) database, and differentially expressed genes (DEGs) associated with NPC were identified by differential expression analysis. Significant DEGs were then selected with an RF algorithm, and an ANN was used to construct a diagnostic model for NPC. The performance of the diagnostic model was evaluated by the area under the curve (AUC) on a validation set. Lasso-Cox regression was used to identify gene signatures associated with prognosis. Overall survival (OS) and disease-free survival (DFS) prediction models were constructed and validated using data from The Cancer Genome Atlas (TCGA) database and the International Cancer Genome Consortium (ICGC) database.
RESULTS: A total of 582 DEGs associated with NPC were identified, of which 14 significant genes were selected by the RF algorithm. A diagnostic model for NPC was constructed using the ANN, and its validity was confirmed on the training set [AUC = 0.947; 95% confidence interval (CI): 0.911–0.969] and on the validation set (AUC = 0.864; 95% CI: 0.828–0.901). The 24-gene signatures associated with prognosis were identified by Lasso-Cox regression, and prediction models for OS and DFS of NPC were constructed on the training set. Finally, the ability of the model was validated on the validation set.
CONCLUSIONS: Several potential gene signatures associated with NPC were identified, and a high-performance predictive model for the early diagnosis of NPC and a prognostic prediction model with robust performance were developed. The results of this study provide a reference for future research on the early diagnosis, screening, treatment, and molecular mechanisms of NPC.
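The abstract reports diagnostic performance as AUC values. As a minimal sketch of what that metric measures, and not the authors' code, the following computes AUC as the fraction of (positive, negative) sample pairs that a model ranks correctly, counting ties as half. The function name `auc_score` and the labels and scores are made up for illustration; they are not data from the study.

```python
from typing import Sequence


def auc_score(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Return the AUC: the probability that a randomly chosen positive
    sample receives a higher score than a randomly chosen negative one.
    Tied scores count as half a correct ranking."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")

    correct = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(pos) * len(neg))


# Hypothetical example: 1 = tumour sample, 0 = control sample.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]  # made-up model outputs
auc = auc_score(labels, scores)
print(f"AUC = {auc:.3f}")
```

An AUC of 1.0 means every positive sample is scored above every negative one, while 0.5 corresponds to random ranking. The pairwise loop is quadratic in the number of samples, which is acceptable for a small illustration; a rank-based computation would be the usual choice for larger cohorts.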
format Online
Article
Text
id pubmed-10248568
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher AME Publishing Company
record_format MEDLINE/PubMed
spelling pubmed-10248568 2023-06-09 Transl Cancer Res Original Article AME Publishing Company 2023-04-10 2023-05-31 /pmc/articles/PMC10248568/ /pubmed/37304552 http://dx.doi.org/10.21037/tcr-22-2700 Text en 2023 Translational Cancer Research. All rights reserved. https://creativecommons.org/licenses/by-nc-nd/4.0/ Open Access Statement: This is an Open Access article distributed in accordance with the Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International License (CC BY-NC-ND 4.0), which permits the non-commercial replication and distribution of the article with the strict proviso that no changes or edits are made and the original work is properly cited (including links to both the formal publication through the relevant DOI and the license). See: https://creativecommons.org/licenses/by-nc-nd/4.0/.
title Construction of diagnostic and prognostic models based on gene signatures of nasopharyngeal carcinoma by machine learning methods
topic Original Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10248568/
https://www.ncbi.nlm.nih.gov/pubmed/37304552
http://dx.doi.org/10.21037/tcr-22-2700