Exploration of predictive and prognostic alternative splicing signatures in lung adenocarcinoma using machine learning methods
BACKGROUND: Alternative splicing (AS) plays critical roles in generating protein diversity and complexity. Dysregulation of AS underlies the initiation and progression of tumors. Machine learning approaches have emerged as efficient tools to identify promising biomarkers. It is meaningful to explore...
Main Authors: | Cai, Qidong; He, Boxue; Zhang, Pengfei; Zhao, Zhenyu; Peng, Xiong; Zhang, Yuqian; Xie, Hui; Wang, Xiang |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2020 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7720605/ https://www.ncbi.nlm.nih.gov/pubmed/33287830 http://dx.doi.org/10.1186/s12967-020-02635-y |
_version_ | 1783619883473305600 |
---|---|
author | Cai, Qidong He, Boxue Zhang, Pengfei Zhao, Zhenyu Peng, Xiong Zhang, Yuqian Xie, Hui Wang, Xiang |
author_facet | Cai, Qidong He, Boxue Zhang, Pengfei Zhao, Zhenyu Peng, Xiong Zhang, Yuqian Xie, Hui Wang, Xiang |
author_sort | Cai, Qidong |
collection | PubMed |
description | BACKGROUND: Alternative splicing (AS) plays critical roles in generating protein diversity and complexity. Dysregulation of AS underlies the initiation and progression of tumors. Machine learning approaches have emerged as efficient tools to identify promising biomarkers. It is meaningful to explore pivotal AS events (ASEs) to deepen understanding and improve prognostic assessments of lung adenocarcinoma (LUAD) via machine learning algorithms. METHOD: RNA sequencing data and AS data were extracted from The Cancer Genome Atlas (TCGA) database and the TCGA SpliceSeq database. Using several machine learning methods, we identified 24 pairs of LUAD-related ASEs implicated in splicing switches and a random forest-based classifier for identifying lymph node metastasis (LNM) consisting of 12 ASEs. Furthermore, we identified key prognosis-related ASEs and established a 16-ASE-based prognostic model to predict overall survival for LUAD patients using a Cox regression model, random survival forest analysis, and a forward selection model. Bioinformatics analyses were also applied to identify underlying mechanisms and associated upstream splicing factors (SFs). RESULTS: Each pair of ASEs was spliced from the same parent gene and exhibited perfect inverse intrapair correlation (correlation coefficient = −1). The 12-ASE-based classifier showed robust ability to evaluate the LNM status of LUAD patients, with an area under the receiver operating characteristic (ROC) curve (AUC) greater than 0.7 in fivefold cross-validation. The prognostic model performed well at 1, 3, 5, and 10 years in both the training cohort and the internal test cohort. Univariate and multivariate Cox regression indicated that the prognostic model could be used as an independent prognostic factor for patients with LUAD. Further analysis revealed correlations between the prognostic model and American Joint Committee on Cancer stage, T stage, N stage, and living status. The splicing network constructed from survival-related SFs and ASEs depicts the regulatory relationships between them. CONCLUSION: In summary, our study provides insight into LUAD research and management based on these AS biomarkers. |
format | Online Article Text |
id | pubmed-7720605 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-7720605 2020-12-08 Exploration of predictive and prognostic alternative splicing signatures in lung adenocarcinoma using machine learning methods Cai, Qidong He, Boxue Zhang, Pengfei Zhao, Zhenyu Peng, Xiong Zhang, Yuqian Xie, Hui Wang, Xiang J Transl Med Research BACKGROUND: Alternative splicing (AS) plays critical roles in generating protein diversity and complexity. Dysregulation of AS underlies the initiation and progression of tumors. Machine learning approaches have emerged as efficient tools to identify promising biomarkers. It is meaningful to explore pivotal AS events (ASEs) to deepen understanding and improve prognostic assessments of lung adenocarcinoma (LUAD) via machine learning algorithms. METHOD: RNA sequencing data and AS data were extracted from The Cancer Genome Atlas (TCGA) database and the TCGA SpliceSeq database. Using several machine learning methods, we identified 24 pairs of LUAD-related ASEs implicated in splicing switches and a random forest-based classifier for identifying lymph node metastasis (LNM) consisting of 12 ASEs. Furthermore, we identified key prognosis-related ASEs and established a 16-ASE-based prognostic model to predict overall survival for LUAD patients using a Cox regression model, random survival forest analysis, and a forward selection model. Bioinformatics analyses were also applied to identify underlying mechanisms and associated upstream splicing factors (SFs). RESULTS: Each pair of ASEs was spliced from the same parent gene and exhibited perfect inverse intrapair correlation (correlation coefficient = −1). The 12-ASE-based classifier showed robust ability to evaluate the LNM status of LUAD patients, with an area under the receiver operating characteristic (ROC) curve (AUC) greater than 0.7 in fivefold cross-validation. The prognostic model performed well at 1, 3, 5, and 10 years in both the training cohort and the internal test cohort. Univariate and multivariate Cox regression indicated that the prognostic model could be used as an independent prognostic factor for patients with LUAD. Further analysis revealed correlations between the prognostic model and American Joint Committee on Cancer stage, T stage, N stage, and living status. The splicing network constructed from survival-related SFs and ASEs depicts the regulatory relationships between them. CONCLUSION: In summary, our study provides insight into LUAD research and management based on these AS biomarkers. BioMed Central 2020-12-07 /pmc/articles/PMC7720605/ /pubmed/33287830 http://dx.doi.org/10.1186/s12967-020-02635-y Text en © The Author(s) 2020 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data. |
spellingShingle | Research Cai, Qidong He, Boxue Zhang, Pengfei Zhao, Zhenyu Peng, Xiong Zhang, Yuqian Xie, Hui Wang, Xiang Exploration of predictive and prognostic alternative splicing signatures in lung adenocarcinoma using machine learning methods |
title | Exploration of predictive and prognostic alternative splicing signatures in lung adenocarcinoma using machine learning methods |
title_full | Exploration of predictive and prognostic alternative splicing signatures in lung adenocarcinoma using machine learning methods |
title_fullStr | Exploration of predictive and prognostic alternative splicing signatures in lung adenocarcinoma using machine learning methods |
title_full_unstemmed | Exploration of predictive and prognostic alternative splicing signatures in lung adenocarcinoma using machine learning methods |
title_short | Exploration of predictive and prognostic alternative splicing signatures in lung adenocarcinoma using machine learning methods |
title_sort | exploration of predictive and prognostic alternative splicing signatures in lung adenocarcinoma using machine learning methods |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7720605/ https://www.ncbi.nlm.nih.gov/pubmed/33287830 http://dx.doi.org/10.1186/s12967-020-02635-y |
work_keys_str_mv | AT caiqidong explorationofpredictiveandprognosticalternativesplicingsignaturesinlungadenocarcinomausingmachinelearningmethods AT heboxue explorationofpredictiveandprognosticalternativesplicingsignaturesinlungadenocarcinomausingmachinelearningmethods AT zhangpengfei explorationofpredictiveandprognosticalternativesplicingsignaturesinlungadenocarcinomausingmachinelearningmethods AT zhaozhenyu explorationofpredictiveandprognosticalternativesplicingsignaturesinlungadenocarcinomausingmachinelearningmethods AT pengxiong explorationofpredictiveandprognosticalternativesplicingsignaturesinlungadenocarcinomausingmachinelearningmethods AT zhangyuqian explorationofpredictiveandprognosticalternativesplicingsignaturesinlungadenocarcinomausingmachinelearningmethods AT xiehui explorationofpredictiveandprognosticalternativesplicingsignaturesinlungadenocarcinomausingmachinelearningmethods AT wangxiang explorationofpredictiveandprognosticalternativesplicingsignaturesinlungadenocarcinomausingmachinelearningmethods |