Construction and Validation of Early Warning Model of Lung Cancer Based on Machine Learning: A Retrospective Study

Bibliographic Details
Main Authors: Ye, Siyu, Pan, Jiongwei, Ye, Zaiting, Cao, Zhuo, Cai, Xiaoping, Zheng, Hao, Ye, Hong
Format: Online Article Text
Language: English
Published: SAGE Publications 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9676307/
https://www.ncbi.nlm.nih.gov/pubmed/36380607
http://dx.doi.org/10.1177/15330338221136724
_version_ 1784833566213931008
author Ye, Siyu
Pan, Jiongwei
Ye, Zaiting
Cao, Zhuo
Cai, Xiaoping
Zheng, Hao
Ye, Hong
author_facet Ye, Siyu
Pan, Jiongwei
Ye, Zaiting
Cao, Zhuo
Cai, Xiaoping
Zheng, Hao
Ye, Hong
author_sort Ye, Siyu
collection PubMed
description Background: The purpose of this retrospective study was to construct and validate an early warning model of lung cancer using machine learning. Methods: The CDKN2A gene expression profile and clinical information were downloaded from The Cancer Genome Atlas (TCGA) database and divided into a tumor group and a normal group (n = 57). The top 5 somatic mutation-related genes were extracted from 567 somatic mutation data downloaded from the TCGA database using a random forest algorithm. A Cox proportional hazards model and a nomogram were constructed by combining CDKN2A, the 5 somatic mutation-related genes, gender, and smoking index. Patients were divided into high-risk and low-risk groups according to risk score. The predictive performance of the model for lung cancer prognosis was assessed by Kaplan–Meier survival analysis and receiver operating characteristic curves. Results: We constructed a prognostic model consisting of 5 somatic mutation-related genes (sphingosine 1-phosphate receptor 1 [S1PR1], dedicator of cytokinesis 7 [DOCK7], DEAD-box helicase 4 [DDX4], laminin subunit beta 3 [LAMB3], and importin 5 [IPO5]), cyclin-dependent kinase inhibitor 2A (CDKN2A), gender, and smoking indicators. The high-risk group had lower overall survival than the low-risk group (hazard ratio = 2.14, P = 0.0323). The areas under the curve for predicting 3-year, 5-year, and 10-year survival were 0.609, 0.673, and 0.698, respectively. The accuracy, sensitivity, and specificity of the model for predicting 10-year survival in lung cancer were 76.19%, 56.71%, and 86.23%, respectively. Conclusion: The lung cancer early warning model and nomogram may provide an essential reference for the clinical management of patients with lung cancer.
format Online
Article
Text
id pubmed-9676307
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher SAGE Publications
record_format MEDLINE/PubMed
spelling pubmed-9676307 2022-11-22 Construction and Validation of Early Warning Model of Lung Cancer Based on Machine Learning: A Retrospective Study Ye, Siyu Pan, Jiongwei Ye, Zaiting Cao, Zhuo Cai, Xiaoping Zheng, Hao Ye, Hong Technol Cancer Res Treat Original Article Background: This study is a retrospective study. The purpose of this study is to construct and validate an early warning model of lung cancer through machine learning. Methods: The CDKN2A gene expression profile and clinical information were downloaded from The Cancer Genome Atlas (TCGA) database and divided into a tumor group and a normal group (n = 57). The top 5 somatic mutation-related genes were extracted from 567 somatic mutation data downloaded from TCGA database using random forest algorithm. Cox proportional hazard model and nomogram were constructed combining CDKN2A, 5 somatic mutation-related genes, gender, and smoking index. Patients were divided into high-risk and low-risk groups according to risk score. The predictability of the model in the prognosis of lung cancer was estimated by Kaplan–Meier survival analysis and receiver operating characteristics curve. Results: We constructed a prognostic model consisting of 5 somatic mutation-related genes (sphingosine 1-phosphate receptor 1 [S1PR1], dedicator of cytokinesis 7 [DOCK7], DEAD-box helicase 4 [DDX4], laminin subunit beta 3 [LAMB3], and importin 5 [IPO5]), cyclin-dependent kinase inhibitor 2A (CDKN2A), gender, and smoking indicators. The high-risk group had a lower overall survival rate compared to the low-risk group (hazard ratio = 2.14, P = 0.0323). The area under the curve predicted for 3-year, 5-year, and 10-year survival rates are 0.609, 0.673, and 0.698, respectively. The accuracy, sensitivity, and specificity of the model for predicting the 10-year survival rate of lung cancer are 76.19%, 56.71%, and 86.23%.
Conclusion: The lung cancer early warning model and nomogram may provide an essential reference for patients with lung cancer management in the clinic. SAGE Publications 2022-11-15 /pmc/articles/PMC9676307/ /pubmed/36380607 http://dx.doi.org/10.1177/15330338221136724 Text en © The Author(s) 2022 https://creativecommons.org/licenses/by-nc/4.0/ This article is distributed under the terms of the Creative Commons Attribution-NonCommercial 4.0 License (https://creativecommons.org/licenses/by-nc/4.0/) which permits non-commercial use, reproduction and distribution of the work without further permission provided the original work is attributed as specified on the SAGE and Open Access page (https://us.sagepub.com/en-us/nam/open-access-at-sage).
spellingShingle Original Article
Ye, Siyu
Pan, Jiongwei
Ye, Zaiting
Cao, Zhuo
Cai, Xiaoping
Zheng, Hao
Ye, Hong
Construction and Validation of Early Warning Model of Lung Cancer Based on Machine Learning: A Retrospective Study
title Construction and Validation of Early Warning Model of Lung Cancer Based on Machine Learning: A Retrospective Study
title_full Construction and Validation of Early Warning Model of Lung Cancer Based on Machine Learning: A Retrospective Study
title_fullStr Construction and Validation of Early Warning Model of Lung Cancer Based on Machine Learning: A Retrospective Study
title_full_unstemmed Construction and Validation of Early Warning Model of Lung Cancer Based on Machine Learning: A Retrospective Study
title_short Construction and Validation of Early Warning Model of Lung Cancer Based on Machine Learning: A Retrospective Study
title_sort construction and validation of early warning model of lung cancer based on machine learning: a retrospective study
topic Original Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9676307/
https://www.ncbi.nlm.nih.gov/pubmed/36380607
http://dx.doi.org/10.1177/15330338221136724
work_keys_str_mv AT yesiyu constructionandvalidationofearlywarningmodeloflungcancerbasedonmachinelearningaretrospectivestudy
AT panjiongwei constructionandvalidationofearlywarningmodeloflungcancerbasedonmachinelearningaretrospectivestudy
AT yezaiting constructionandvalidationofearlywarningmodeloflungcancerbasedonmachinelearningaretrospectivestudy
AT caozhuo constructionandvalidationofearlywarningmodeloflungcancerbasedonmachinelearningaretrospectivestudy
AT caixiaoping constructionandvalidationofearlywarningmodeloflungcancerbasedonmachinelearningaretrospectivestudy
AT zhenghao constructionandvalidationofearlywarningmodeloflungcancerbasedonmachinelearningaretrospectivestudy
AT yehong constructionandvalidationofearlywarningmodeloflungcancerbasedonmachinelearningaretrospectivestudy
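
Illustrative note: the abstract in this record describes splitting patients into high-risk and low-risk groups by risk score and then reporting accuracy, sensitivity, and specificity. The Python sketch below shows one way such a split and those three metrics could be computed. It is not the authors' code. The function names, the toy numbers, and the choice of the median as the cutoff are all assumptions made for illustration (the abstract does not state the cutoff), and the sketch ignores censoring, which a real survival analysis would need to handle.

```python
from statistics import median


def split_by_risk(scores, cutoff=None):
    """Label each risk score 'high' or 'low'.

    Assumption: the cutoff defaults to the median score; the abstract
    does not say which cutoff the authors used.
    """
    if cutoff is None:
        cutoff = median(scores)
    return ["high" if s > cutoff else "low" for s in scores]


def classification_metrics(y_true, y_pred):
    """Return accuracy, sensitivity, and specificity for binary labels (1 = event)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    def ratio(num, den):
        # Avoid ZeroDivisionError when a class is absent from the data.
        return num / den if den else float("nan")

    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
    }


if __name__ == "__main__":
    # Hypothetical toy data, not taken from the study.
    risk_scores = [0.2, 1.5, 0.7, 2.1, 0.4, 1.1]
    died_within_horizon = [0, 1, 0, 1, 1, 0]

    groups = split_by_risk(risk_scores)
    predicted = [1 if g == "high" else 0 for g in groups]
    print(groups)
    print(classification_metrics(died_within_horizon, predicted))
```

Treating "high risk" as a predicted event is itself a simplification; it is used here only to connect the risk-group split to the confusion-matrix metrics named in the abstract.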