Prediction of survival and recurrence in patients with pancreatic cancer by integrating multi-omics data

Predicting the prognosis of pancreatic cancer is important because of the very low survival rates of patients with this particular cancer. Although several studies have used microRNA and gene expression profiles and clinical data, as well as images of tissues and cells, to predict cancer survival an...

Bibliographic Details
Main Authors: Baek, Bin; Lee, Hyunju
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2020
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7609582/
https://www.ncbi.nlm.nih.gov/pubmed/33144687
http://dx.doi.org/10.1038/s41598-020-76025-1
_version_ 1783605050495467520
author Baek, Bin
Lee, Hyunju
author_facet Baek, Bin
Lee, Hyunju
author_sort Baek, Bin
collection PubMed
description Predicting the prognosis of pancreatic cancer is important because of the very low survival rates of patients with this particular cancer. Although several studies have used microRNA and gene expression profiles and clinical data, as well as images of tissues and cells, to predict cancer survival and recurrence, the accuracies of these approaches in the prediction of high-risk pancreatic adenocarcinoma (PAAD) still need to be improved. Accordingly, in this study, we proposed two biological features based on multi-omics datasets to predict survival and recurrence among patients with PAAD. First, the clonal expansion of cancer cells with somatic mutations was used to predict prognosis. Using whole-exome sequencing data from 134 patients with PAAD from The Cancer Genome Atlas (TCGA), we found five candidate genes that were mutated in the early stages of tumorigenesis with high cellular prevalence (CP). CDKN2A, TP53, TTN, KCNJ18, and KRAS had the highest CP values among the patients with PAAD, and survival and recurrence rates were significantly different between the patients harboring mutations in these candidate genes and those harboring mutations in other genes (p = 2.39E−03, p = 8.47E−04, respectively). Second, we generated an autoencoder to integrate the RNA sequencing, microRNA sequencing, and DNA methylation data from 134 patients with PAAD from TCGA. The autoencoder robustly reduced the dimensions of these multi-omics data, and the K-means clustering method was then used to cluster the patients into two subgroups. The subgroups of patients had significant differences in survival and recurrence (p = 1.41E−03, p = 4.43E−04, respectively). Finally, we developed a prediction model for prognosis using these two biological features and clinical data. 
When support vector machines, random forest, logistic regression, and L2 regularized logistic regression were used as prediction models, logistic regression analysis generally revealed the best performance for both disease-free survival (DFS) and overall survival (OS) (accuracy [ACC] = 0.762 and area under the curve [AUC] = 0.795 for DFS; ACC = 0.776 and AUC = 0.769 for OS). Thus, we could classify patients with a high probability of recurrence and at a high risk of poor outcomes. Our study provides insights into new personalized therapies on the basis of mutation status and multi-omics data.
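The description above reports accuracy (ACC) and area under the ROC curve (AUC) for each prediction model. As a minimal sketch of how these two metrics are commonly computed for a binary outcome, the Python below uses only the standard library. The function names, the 0.5 decision threshold, and the toy labels and scores are illustrative assumptions; they are not taken from the article and do not reproduce its results.

```python
def accuracy(labels, scores, threshold=0.5):
    """Fraction of samples whose thresholded score matches the 0/1 label.

    The 0.5 threshold is an assumption made for this sketch.
    """
    predictions = [1 if s >= threshold else 0 for s in scores]
    correct = sum(1 for y, p in zip(labels, predictions) if y == p)
    return correct / len(labels)


def roc_auc(labels, scores):
    """AUC as the probability that a random positive outranks a random negative.

    Ties count as half a win (the Mann-Whitney U formulation of AUC).
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data: 1 = event (for example, recurrence), 0 = no event.
toy_labels = [1, 0, 1, 1, 0, 0]
toy_scores = [0.9, 0.2, 0.6, 0.4, 0.5, 0.1]

toy_acc = accuracy(toy_labels, toy_scores)
toy_auc = roc_auc(toy_labels, toy_scores)
```

ACC depends on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds, which is one reason the two numbers are usually reported together.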
format Online
Article
Text
id pubmed-7609582
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-7609582 2020-11-05 Prediction of survival and recurrence in patients with pancreatic cancer by integrating multi-omics data Baek, Bin Lee, Hyunju Sci Rep Article Nature Publishing Group UK 2020-11-03 /pmc/articles/PMC7609582/ /pubmed/33144687 http://dx.doi.org/10.1038/s41598-020-76025-1 Text en © The Author(s) 2020 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
spellingShingle Article
Baek, Bin
Lee, Hyunju
Prediction of survival and recurrence in patients with pancreatic cancer by integrating multi-omics data
title Prediction of survival and recurrence in patients with pancreatic cancer by integrating multi-omics data
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7609582/
https://www.ncbi.nlm.nih.gov/pubmed/33144687
http://dx.doi.org/10.1038/s41598-020-76025-1