Developing prognostic gene panel of survival time in lung adenocarcinoma patients using machine learning
BACKGROUND: Transcriptome data generates massive amounts of information that can be used for characterization and prognosis of patient outcomes for many diseases. The goal of our research is to predict the survival time of lung adenocarcinoma patients and improve the accuracy of classifying the long...
Main authors: | Liu, Yidi; Yang, Mu; Sun, Weiwei; Zhang, Mingqiang; Sun, Jiao; Wang, Wenjuan; Tang, Dongqi; Yuan, Dongfeng |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | AME Publishing Company 2020 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8799101/ https://www.ncbi.nlm.nih.gov/pubmed/35117753 http://dx.doi.org/10.21037/tcr-19-2739 |
_version_ | 1784641986939060224 |
---|---|
author | Liu, Yidi; Yang, Mu; Sun, Weiwei; Zhang, Mingqiang; Sun, Jiao; Wang, Wenjuan; Tang, Dongqi; Yuan, Dongfeng |
author_sort | Liu, Yidi |
collection | PubMed |
description | BACKGROUND: Transcriptome data generate massive amounts of information that can be used for characterizing and predicting patient outcomes in many diseases. The goal of our research is to predict the survival time of lung adenocarcinoma patients and to improve the accuracy of classifying them into long-survival and short-survival cohorts. METHODS: We used the Relief method to select prognostic features related to the survival time of lung adenocarcinoma patients, then predicted whether a patient's survival time was >3 years using eight machine learning algorithms (Support Vector Machines, Random Forests, Logistic Regression, Naïve Bayes, Linear Regression, Support Vector Regression (polynomial kernel), Support Vector Regression (linear kernel), and Ridge Regression). The best-performing algorithm was then chosen to build a predictive model of survival time. A second dataset was used to verify the stability and suitability of this model. We also explored the mechanisms underlying RNA expression changes by examining the corresponding DNA mutations and DNA methylation patterns in the 22 selected genetic features. RESULTS: The best machine learning algorithm was Naïve Bayes (accuracy =75%, AUC =0.81) using the top 22 genetic features, and it also performed stably and well on the second dataset. Across the 22 genes, the coupled mutation number was lower in the long-survival group (>6 years) than in the short-survival group (<1 year) (P=0.031). CONCLUSIONS: Expression of the gene panel can predict the survival time of lung adenocarcinoma patients using Naïve Bayes. These 22 genes do affect survival time in lung adenocarcinoma. |
format | Online Article Text |
id | pubmed-8799101 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | AME Publishing Company |
record_format | MEDLINE/PubMed |
spelling | pubmed-8799101 2022-02-02 Transl Cancer Res Original Article AME Publishing Company 2020-06 /pmc/articles/PMC8799101/ /pubmed/35117753 http://dx.doi.org/10.21037/tcr-19-2739 Text en 2020 Translational Cancer Research. All rights reserved. https://creativecommons.org/licenses/by-nc-nd/4.0/ Open Access Statement: This is an Open Access article distributed in accordance with the Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International License (CC BY-NC-ND 4.0), which permits the non-commercial replication and distribution of the article with the strict proviso that no changes or edits are made and the original work is properly cited (including links to both the formal publication through the relevant DOI and the license). See: https://creativecommons.org/licenses/by-nc-nd/4.0/. |
title | Developing prognostic gene panel of survival time in lung adenocarcinoma patients using machine learning |
topic | Original Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8799101/ https://www.ncbi.nlm.nih.gov/pubmed/35117753 http://dx.doi.org/10.21037/tcr-19-2739 |
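The abstract's pipeline (Relief feature ranking, a >3-year survival label, a Naïve Bayes classifier scored by accuracy and AUC) can be illustrated with a minimal sketch. Everything specific here is an assumption rather than the authors' method: the data are synthetic, the two informative "genes" are planted by hand, the Relief variant is the basic two-class form with range-scaled differences, the classifier is Gaussian Naïve Bayes, and the split sizes and `n_iter` value are arbitrary. The record does not state the authors' preprocessing, Relief parameters, or validation scheme.

```python
import math
import random


def relief_weights(X, y, n_iter=100, seed=0):
    """Basic two-class Relief: a feature gains weight when it differs on the
    nearest miss (other class) and loses weight when it differs on the
    nearest hit (same class)."""
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    # Feature ranges scale every per-feature difference into [0, 1].
    spans = [(max(col) - min(col)) or 1.0 for col in zip(*X)]

    def diff(a, b, j):
        return abs(a[j] - b[j]) / spans[j]

    def dist(a, b):
        return sum(diff(a, b, j) for j in range(d))

    weights = [0.0] * d
    for _ in range(n_iter):
        i = rng.randrange(n)
        hit = min((k for k in range(n) if k != i and y[k] == y[i]),
                  key=lambda k: dist(X[i], X[k]))
        miss = min((k for k in range(n) if y[k] != y[i]),
                   key=lambda k: dist(X[i], X[k]))
        for j in range(d):
            weights[j] += (diff(X[i], X[miss], j) - diff(X[i], X[hit], j)) / n_iter
    return weights


def fit_gaussian_nb(X, y):
    """Per class: log prior, feature means, feature variances."""
    model = {}
    for c in sorted(set(y)):
        cols = list(zip(*[x for x, label in zip(X, y) if label == c]))
        n_c = len(cols[0])
        means = [sum(col) / n_c for col in cols]
        variances = [sum((v - m) ** 2 for v in col) / n_c + 1e-9
                     for col, m in zip(cols, means)]
        model[c] = (math.log(n_c / len(X)), means, variances)
    return model


def log_odds(model, x):
    """Log posterior odds of class 1 (long survival) versus class 0."""
    def joint(c):
        prior, means, variances = model[c]
        return prior + sum(-0.5 * math.log(2 * math.pi * v) - (xi - m) ** 2 / (2 * v)
                           for xi, m, v in zip(x, means, variances))
    return joint(1) - joint(0)


def auc(scores, labels):
    """Probability that a random positive outscores a random negative."""
    pos = [s for s, label in zip(scores, labels) if label == 1]
    neg = [s for s, label in zip(scores, labels) if label == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Synthetic stand-in for expression data: 120 "patients", 10 "genes".
rng = random.Random(1)
X, years = [], []
for _ in range(120):
    t = rng.uniform(0.2, 8.0)                 # survival time in years (synthetic)
    sign = 1.0 if t > 3.0 else -1.0
    row = [rng.gauss(0.0, 1.0) for _ in range(10)]
    row[0] += 2.0 * sign                      # planted informative feature
    row[1] -= 1.5 * sign                      # planted informative feature
    X.append(row)
    years.append(t)

y = [1 if t > 3.0 else 0 for t in years]      # 1 = survival > 3 years
X_train, y_train, X_test, y_test = X[:80], y[:80], X[80:], y[80:]

# Rank features on the training split only, then keep the top two.
weights = relief_weights(X_train, y_train)
top = sorted(range(len(weights)), key=lambda j: weights[j], reverse=True)[:2]


def select(rows):
    return [[r[j] for j in top] for r in rows]


model = fit_gaussian_nb(select(X_train), y_train)
scores = [log_odds(model, x) for x in select(X_test)]
accuracy = sum((s > 0) == (label == 1) for s, label in zip(scores, y_test)) / len(y_test)
print("selected features:", sorted(top))
print("accuracy:", round(accuracy, 3), "AUC:", round(auc(scores, y_test), 3))
```

Ranking features on the training split alone keeps the held-out samples from influencing feature selection, which would otherwise inflate the reported accuracy and AUC. The numbers this sketch prints describe the synthetic data only and say nothing about the 75% accuracy or 0.81 AUC reported in the abstract.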