A Novel Deep Learning Method to Predict Lung Cancer Long-Term Survival With Biological Knowledge Incorporated Gene Expression Images and Clinical Data
Lung cancer is the leading cause of cancer deaths, so predicting the survival status of lung cancer patients is of great value. However, existing methods mainly depend on statistical machine learning (ML) algorithms. Moreover, they are not appropriate for high-dimensionality genomics...
Main authors: | Wang, Shuo; Zhang, Hao; Liu, Zhen; Liu, Yuanning |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Frontiers Media S.A. 2022 |
Subjects: | Genetics |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8964372/ https://www.ncbi.nlm.nih.gov/pubmed/35368657 http://dx.doi.org/10.3389/fgene.2022.800853 |
_version_ | 1784678201715326976 |
---|---|
author | Wang, Shuo Zhang, Hao Liu, Zhen Liu, Yuanning |
author_sort | Wang, Shuo |
collection | PubMed |
description | Lung cancer is the leading cause of cancer deaths, so predicting the survival status of lung cancer patients is of great value. However, existing methods mainly depend on statistical machine learning (ML) algorithms, which are not well suited to high-dimensional genomics data. Deep learning (DL), with its strong capability for learning from high-dimensional data, can be used to predict lung cancer survival from genomics data. The Cancer Genome Atlas (TCGA) is a database that contains many kinds of genomics data for 33 cancer types. With this enormous amount of data, researchers can analyze key factors related to cancer therapy. This paper proposes a novel method to predict lung cancer long-term survival using gene expression data from TCGA. First, we select the genes most relevant to the target problem with a supervised feature selection method, the mutual information selector. Second, we propose a method to convert gene expression data into two kinds of images that incorporate KEGG BRITE and KEGG Pathway data, so that a convolutional neural network (CNN) model can learn high-level features. We then design a CNN-based DL model and add two kinds of clinical data to improve performance, yielding a multimodal DL model. The results of the generalized experiments indicate that our method performs much better than the ML models and unimodal DL models. Furthermore, we conduct survival analysis and observe that our model better divides the samples into high-risk and low-risk groups. |
format | Online Article Text |
id | pubmed-8964372 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-8964372 2022-03-31 A Novel Deep Learning Method to Predict Lung Cancer Long-Term Survival With Biological Knowledge Incorporated Gene Expression Images and Clinical Data Wang, Shuo Zhang, Hao Liu, Zhen Liu, Yuanning Front Genet Genetics Frontiers Media S.A. 2022-03-14 /pmc/articles/PMC8964372/ /pubmed/35368657 http://dx.doi.org/10.3389/fgene.2022.800853 Text en Copyright © 2022 Wang, Zhang, Liu and Liu. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
title | A Novel Deep Learning Method to Predict Lung Cancer Long-Term Survival With Biological Knowledge Incorporated Gene Expression Images and Clinical Data |
topic | Genetics |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8964372/ https://www.ncbi.nlm.nih.gov/pubmed/35368657 http://dx.doi.org/10.3389/fgene.2022.800853 |
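
The gene-selection step named in the description (a supervised "mutual information selector") can be illustrated with a minimal sketch. This is not the authors' implementation: the median binarization, the function names, the gene labels, and the toy expression values are all assumptions made for illustration, and the record does not say how the paper discretizes expression values or how many genes it keeps.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Mutual information (in nats) between two equal-length discrete sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px = Counter(xs)
    py = Counter(ys)
    mi = 0.0
    for (x, y), count in joint.items():
        # p(x, y) * log( p(x, y) / (p(x) * p(y)) ), written with raw counts
        mi += (count / n) * math.log(count * n / (px[x] * py[y]))
    return mi


def binarize_at_median(values):
    """Assumed discretization: 1 if the value is above the sample median, else 0."""
    ordered = sorted(values)
    mid = len(ordered) // 2
    if len(ordered) % 2:
        median = ordered[mid]
    else:
        median = (ordered[mid - 1] + ordered[mid]) / 2
    return [1 if v > median else 0 for v in values]


def select_top_genes(expression, labels, k):
    """Rank genes by mutual information with the survival label and keep the top k.

    expression: dict mapping a gene name to its per-sample expression values.
    labels: per-sample binary survival status (hypothetical encoding).
    """
    scores = {
        gene: mutual_information(binarize_at_median(values), labels)
        for gene, values in expression.items()
    }
    # Sort by descending score; break ties by gene name for determinism.
    ranked = sorted(scores, key=lambda gene: (-scores[gene], gene))
    return ranked[:k], scores


# Toy data (invented): 8 samples, 1 = long-term survivor, 0 = not.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
expression = {
    "GENE_A": [9.1, 8.7, 9.4, 8.9, 2.1, 1.8, 2.5, 2.2],  # tracks the label
    "GENE_B": [5.0, 1.0, 5.2, 1.1, 5.1, 0.9, 4.8, 1.2],  # unrelated to the label
    "GENE_C": [7.0, 7.5, 6.8, 2.0, 7.2, 2.2, 1.9, 2.4],  # partly informative
}

selected, scores = select_top_genes(expression, labels, k=2)
print(selected)
```

On real expression matrices, a library estimator that handles continuous features would likely be a better choice than hand-rolled binarization; the sketch only shows the ranking idea.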