
Deep multi-omics integration by learning correlation-maximizing representation identifies prognostically stratified cancer subtypes

MOTIVATION: Molecular subtyping by integrative modeling of multi-omics and clinical data can help identify robust and clinically actionable disease subgroups, an essential step in developing precision medicine approaches. RESULTS: We developed a novel outcome-guided molecular subgroupin...


Bibliographic Details
Main authors: Ji, Yanrong, Dutta, Pratik, Davuluri, Ramana
Format: Online Article Text
Language: English
Published: Oxford University Press 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10328436/
https://www.ncbi.nlm.nih.gov/pubmed/37424943
http://dx.doi.org/10.1093/bioadv/vbad075
_version_ 1785069798441353216
author Ji, Yanrong
Dutta, Pratik
Davuluri, Ramana
author_facet Ji, Yanrong
Dutta, Pratik
Davuluri, Ramana
author_sort Ji, Yanrong
collection PubMed
description MOTIVATION: Molecular subtyping by integrative modeling of multi-omics and clinical data can help identify robust and clinically actionable disease subgroups, an essential step in developing precision medicine approaches. RESULTS: We developed a novel outcome-guided molecular subgrouping framework, called Deep Multi-Omics Integrative Subtyping by Maximizing Correlation (DeepMOIS-MC), for integrative learning from multi-omics data by maximizing correlation between all input -omics views. DeepMOIS-MC consists of two parts: clustering and classification. In the clustering part, the preprocessed high-dimensional multi-omics views are input into two-layer fully connected neural networks. The outputs of the individual networks are subjected to a Generalized Canonical Correlation Analysis loss to learn the shared representation. Next, the learned representation is filtered by a regression model to select features that are related to a covariate clinical variable, for example, a survival/outcome. The filtered features are used for clustering to determine the optimal cluster assignments. In the classification stage, the original feature matrix of one of the -omics views is scaled and discretized by equal-frequency binning, and then subjected to feature selection using RandomForest. Using these selected features, classification models (for example, an XGBoost model) are built to predict the molecular subgroups that were identified at the clustering stage. We applied DeepMOIS-MC to lung and liver cancers, using TCGA datasets. In comparative analysis, we found that DeepMOIS-MC outperformed traditional approaches in patient stratification. Finally, we validated the robustness and generalizability of the classification models on independent datasets. We anticipate that DeepMOIS-MC can be adopted for many multi-omics integrative analysis tasks.
AVAILABILITY AND IMPLEMENTATION: Source code for the PyTorch implementation of DGCCA and other DeepMOIS-MC modules is available at GitHub (https://github.com/duttaprat/DeepMOIS-MC). SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics Advances online.
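The shared-representation step in the abstract can be illustrated with a minimal sketch of linear, regularized Generalized Canonical Correlation Analysis (the MAX-VAR formulation) in NumPy. This is not the authors' PyTorch DGCCA code: in DeepMOIS-MC the views would first pass through neural networks, whereas here raw synthetic views are used directly. The function name `linear_gcca`, the regularization value `reg`, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np


def linear_gcca(views, k, reg=1e-3):
    """Return an (n, k) shared representation G for a list of (n, d_j) views.

    Hypothetical helper: G holds the top-k eigenvectors of the sum of
    ridge-regularized projection matrices onto each view's column space.
    """
    n = views[0].shape[0]
    m = np.zeros((n, n))
    for x in views:
        x = x - x.mean(axis=0)                    # center every feature
        c = x.T @ x + reg * np.eye(x.shape[1])    # regularized scatter matrix
        m += x @ np.linalg.solve(c, x.T)          # regularized projection
    m = (m + m.T) / 2.0                           # enforce exact symmetry
    _, eigvecs = np.linalg.eigh(m)                # eigenvalues ascending
    return eigvecs[:, ::-1][:, :k]                # keep the k largest


# Synthetic example: three "omics" views driven by one 2-D latent signal.
rng = np.random.default_rng(0)
n = 200
latent = rng.normal(size=(n, 2))
views = [
    latent @ rng.normal(size=(2, d)) + 0.1 * rng.normal(size=(n, d))
    for d in (30, 20, 10)
]
G = linear_gcca(views, k=2)

# Fraction of the (centered) latent signal captured by the span of G.
lc = latent - latent.mean(axis=0)
residual = lc - G @ (G.T @ lc)
r2 = 1.0 - (residual ** 2).sum() / (lc ** 2).sum()
```

In a pipeline like the one described, the columns of `G` would then be filtered by a regression against the clinical covariate and passed to a clustering algorithm; those later steps are left out of this sketch.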
format Online
Article
Text
id pubmed-10328436
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-10328436 2023-07-08 Deep multi-omics integration by learning correlation-maximizing representation identifies prognostically stratified cancer subtypes Ji, Yanrong Dutta, Pratik Davuluri, Ramana Bioinform Adv Original Article Oxford University Press 2023-06-21 /pmc/articles/PMC10328436/ /pubmed/37424943 http://dx.doi.org/10.1093/bioadv/vbad075 Text en © The Author(s) 2023. Published by Oxford University Press. https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Original Article
Ji, Yanrong
Dutta, Pratik
Davuluri, Ramana
Deep multi-omics integration by learning correlation-maximizing representation identifies prognostically stratified cancer subtypes
title Deep multi-omics integration by learning correlation-maximizing representation identifies prognostically stratified cancer subtypes
topic Original Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10328436/
https://www.ncbi.nlm.nih.gov/pubmed/37424943
http://dx.doi.org/10.1093/bioadv/vbad075