
A Novel Deep Learning Method to Predict Lung Cancer Long-Term Survival With Biological Knowledge Incorporated Gene Expression Images and Clinical Data

Lung cancer is the leading cause of cancer deaths. Therefore, predicting the survival status of lung cancer patients is of great value. However, the existing methods mainly depend on statistical machine learning (ML) algorithms. Moreover, they are not appropriate for high-dimensionality genomics...


Bibliographic Details
Main authors: Wang, Shuo; Zhang, Hao; Liu, Zhen; Liu, Yuanning
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2022
Subjects: Genetics
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8964372/
https://www.ncbi.nlm.nih.gov/pubmed/35368657
http://dx.doi.org/10.3389/fgene.2022.800853
author Wang, Shuo
Zhang, Hao
Liu, Zhen
Liu, Yuanning
collection PubMed
description Lung cancer is the leading cause of cancer deaths, so predicting the survival status of lung cancer patients is of great value. However, existing methods mainly depend on statistical machine learning (ML) algorithms, which are not well suited to high-dimensional genomics data. Deep learning (DL), with its strong capability for learning from high-dimensional data, can instead be used to predict lung cancer survival from genomics data. The Cancer Genome Atlas (TCGA) is a database that contains many kinds of genomics data for 33 cancer types. With this enormous amount of data, researchers can analyze key factors related to cancer therapy. This paper proposes a novel method to predict lung cancer long-term survival using gene expression data from TCGA. First, we select the genes most relevant to the target problem using a supervised feature selection method, the mutual information selector. Second, we propose a method to convert gene expression data into two kinds of images that incorporate KEGG BRITE and KEGG Pathway data, so that a convolutional neural network (CNN) model can learn high-level features from them. We then design a CNN-based DL model and add two kinds of clinical data to improve performance, yielding a multimodal DL model. The results of the generalized experiments indicate that our method performed much better than the ML models and unimodal DL models. Furthermore, we conduct a survival analysis and observe that our model better divides the samples into high-risk and low-risk groups.
format Online
Article
Text
id pubmed-8964372
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-8964372 2022-03-31 Front Genet Genetics Frontiers Media S.A. 2022-03-14 /pmc/articles/PMC8964372/ /pubmed/35368657 http://dx.doi.org/10.3389/fgene.2022.800853 Text en
Copyright © 2022 Wang, Zhang, Liu and Liu. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title A Novel Deep Learning Method to Predict Lung Cancer Long-Term Survival With Biological Knowledge Incorporated Gene Expression Images and Clinical Data
topic Genetics