PathME: pathway based multi-modal sparse autoencoders for clustering of patient-level multi-omics data

Bibliographic Details
Main authors: Lemsara, Amina; Ouadfel, Salima; Fröhlich, Holger
Format: Online Article Text
Language: English
Published: BioMed Central 2020
Subjects: Methodology Article
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7161108/
https://www.ncbi.nlm.nih.gov/pubmed/32299344
http://dx.doi.org/10.1186/s12859-020-3465-2
collection PubMed
description BACKGROUND: Recent years have witnessed an increasing interest in multi-omics data, because these data allow for a better understanding of complex diseases such as cancer at the molecular system level. In addition, multi-omics data increase the chance of robustly identifying molecular patient subgroups and hence open the door to better personalized treatment of diseases. Several methods have been proposed for unsupervised clustering of multi-omics data. However, a number of challenges remain, such as the magnitude of features and the large difference in dimensionality across different omics data sources.
RESULTS: We propose a multi-modal sparse denoising autoencoder framework coupled with sparse non-negative matrix factorization to robustly cluster patients based on multi-omics data. The proposed model specifically leverages pathway information to effectively reduce the dimensionality of omics data into a pathway- and patient-specific score profile. As a consequence, our method allows us to understand which pathway is a feature of which particular patient cluster. Moreover, recently proposed machine learning techniques allow us to disentangle the specific impact of each individual omics feature on a pathway score. We applied our method to cluster patients in several cancer datasets using gene expression, miRNA expression, DNA methylation and CNVs, demonstrating the possibility of obtaining biologically plausible disease subtypes characterized by specific molecular features. Comparison against several competing methods showed a competitive clustering performance. In addition, post-hoc analysis of somatic mutations and clinical data provided supporting evidence and interpretation of the identified clusters.
CONCLUSIONS: Our suggested multi-modal sparse denoising autoencoder approach allows for an effective and interpretable integration of multi-omics data at the pathway level while addressing the high-dimensional character of omics data. Patient-specific pathway score profiles derived from our model allow for a robust identification of disease subgroups.
format Online Article Text
id pubmed-7161108
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-7161108 2020-04-22 BMC Bioinformatics Methodology Article BioMed Central 2020-04-16 /pmc/articles/PMC7161108/ /pubmed/32299344 http://dx.doi.org/10.1186/s12859-020-3465-2 Text en
© The Author(s). 2020 Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
topic Methodology Article