Integrated multi-omics analysis of ovarian cancer using variational autoencoders
Cancer is a complex disease that deregulates cellular functions at various molecular levels (e.g., DNA, RNA, and proteins). Integrated multi-omics analysis of data from these levels is necessary to understand the aberrant cellular functions accountable for cancer and its development. In recent years...
Main authors: | Hira, Muta Tah; Razzaque, M. A.; Angione, Claudio; Scrivens, James; Sawan, Saladin; Sarker, Mosharraf |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Nature Publishing Group UK, 2021 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7973750/ https://www.ncbi.nlm.nih.gov/pubmed/33737557 http://dx.doi.org/10.1038/s41598-021-85285-4 |
_version_ | 1783666887066910720 |
---|---|
author | Hira, Muta Tah; Razzaque, M. A.; Angione, Claudio; Scrivens, James; Sawan, Saladin; Sarker, Mosharraf |
author_facet | Hira, Muta Tah; Razzaque, M. A.; Angione, Claudio; Scrivens, James; Sawan, Saladin; Sarker, Mosharraf |
author_sort | Hira, Muta Tah |
collection | PubMed |
description | Cancer is a complex disease that deregulates cellular functions at various molecular levels (e.g., DNA, RNA, and proteins). Integrated multi-omics analysis of data from these levels is necessary to understand the aberrant cellular functions accountable for cancer and its development. In recent years, Deep Learning (DL) approaches have become a useful tool in integrated multi-omics analysis of cancer data. However, high-dimensional multi-omics data are generally imbalanced, with too many molecular features and relatively few patient samples. This imbalance makes DL-based integrated multi-omics analysis difficult. DL-based dimensionality reduction techniques, including the variational autoencoder (VAE), are a potential solution to balance high-dimensional multi-omics data. However, there are few VAE-based integrated multi-omics analyses, and they are limited to pan-cancer studies. In this work, we performed an integrated multi-omics analysis of ovarian cancer using the compressed features learned through VAE and an improved version of VAE, namely Maximum Mean Discrepancy VAE (MMD-VAE). First, we designed and developed a DL architecture for VAE and MMD-VAE. Then we used the architecture for mono-omics, integrated di-omics and tri-omics data analysis of ovarian cancer through cancer sample identification, molecular subtype clustering and classification, and survival analysis. The results show that MMD-VAE and VAE-based compressed features can respectively classify the transcriptional subtypes of the TCGA datasets with an accuracy in the range of 93.2-95.5% and 87.1-95.7%. Also, survival analysis results show that VAE and MMD-VAE based compressed representations of omics data can be used in cancer prognosis. Based on the results, we can conclude that (i) VAE and MMD-VAE outperform existing dimensionality reduction techniques, (ii) integrated multi-omics analyses perform better than or similarly to their mono-omics counterparts, and (iii) MMD-VAE performs better than VAE on most omics datasets. |
format | Online Article Text |
id | pubmed-7973750 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | Nature Publishing Group UK |
record_format | MEDLINE/PubMed |
spelling | pubmed-7973750 2021-03-19 Integrated multi-omics analysis of ovarian cancer using variational autoencoders. Hira, Muta Tah; Razzaque, M. A.; Angione, Claudio; Scrivens, James; Sawan, Saladin; Sarker, Mosharraf. Sci Rep. Article. Nature Publishing Group UK 2021-03-18 /pmc/articles/PMC7973750/ /pubmed/33737557 http://dx.doi.org/10.1038/s41598-021-85285-4 Text en © The Author(s) 2021, corrected publication 2021 https://creativecommons.org/licenses/by/4.0/ Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. |
spellingShingle | Article; Hira, Muta Tah; Razzaque, M. A.; Angione, Claudio; Scrivens, James; Sawan, Saladin; Sarker, Mosharraf; Integrated multi-omics analysis of ovarian cancer using variational autoencoders |
title | Integrated multi-omics analysis of ovarian cancer using variational autoencoders |
title_sort | integrated multi-omics analysis of ovarian cancer using variational autoencoders |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7973750/ https://www.ncbi.nlm.nih.gov/pubmed/33737557 http://dx.doi.org/10.1038/s41598-021-85285-4 |
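
The abstract in this record names Maximum Mean Discrepancy VAE (MMD-VAE) as the improved variant but does not describe its loss. As a rough illustration only, the sketch below shows how an MMD penalty between encoded samples and samples from a standard normal prior is commonly estimated with a Gaussian kernel. The function names (`gaussian_kernel`, `mmd2`), the kernel width `sigma`, the latent size of 8, and the random stand-in data are all assumptions made for this example; they are not taken from the paper, whose architecture is not given in this record.

```python
import numpy as np


def gaussian_kernel(x, y, sigma=1.0):
    """Pairwise Gaussian (RBF) kernel between rows of x (n, d) and y (m, d)."""
    # Broadcasting yields an (n, m) matrix of squared Euclidean distances.
    sq_dists = np.sum((x[:, None, :] - y[None, :, :]) ** 2, axis=-1)
    return np.exp(-sq_dists / (2.0 * sigma ** 2))


def mmd2(x, y, sigma=1.0):
    """Biased estimate of the squared MMD between two sample sets."""
    return (gaussian_kernel(x, x, sigma).mean()
            + gaussian_kernel(y, y, sigma).mean()
            - 2.0 * gaussian_kernel(x, y, sigma).mean())


rng = np.random.default_rng(0)
# Hypothetical stand-in for encoder outputs: latent codes slightly off the prior.
latent_codes = rng.normal(loc=0.5, scale=1.0, size=(200, 8))
# Samples drawn from the standard normal prior N(0, I).
prior_samples = rng.normal(size=(200, 8))

# In an MMD-VAE-style objective, a term like this would typically be added
# to the reconstruction loss in place of the KL divergence term.
penalty = mmd2(latent_codes, prior_samples)
print(f"MMD^2 penalty: {penalty:.4f}")
```

The penalty is zero when both sample sets are identical and grows as the encoded distribution drifts from the prior, which is the property such an objective relies on.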