
Integrated multi-omics analysis of ovarian cancer using variational autoencoders

Cancer is a complex disease that deregulates cellular functions at various molecular levels (e.g., DNA, RNA, and proteins). Integrated multi-omics analysis of data from these levels is necessary to understand the aberrant cellular functions accountable for cancer and its development. In recent years...


Bibliographic Details
Main authors: Hira, Muta Tah; Razzaque, M. A.; Angione, Claudio; Scrivens, James; Sawan, Saladin; Sarker, Mosharraf
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2021
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7973750/
https://www.ncbi.nlm.nih.gov/pubmed/33737557
http://dx.doi.org/10.1038/s41598-021-85285-4
author Hira, Muta Tah
Razzaque, M. A.
Angione, Claudio
Scrivens, James
Sawan, Saladin
Sarker, Mosharraf
collection PubMed
description Cancer is a complex disease that deregulates cellular functions at various molecular levels (e.g., DNA, RNA, and proteins). Integrated multi-omics analysis of data from these levels is necessary to understand the aberrant cellular functions accountable for cancer and its development. In recent years, Deep Learning (DL) approaches have become a useful tool in integrated multi-omics analysis of cancer data. However, high-dimensional multi-omics data are generally imbalanced, with too many molecular features and relatively few patient samples. This imbalance makes DL-based integrated multi-omics analysis difficult. DL-based dimensionality reduction techniques, including the variational autoencoder (VAE), are a potential solution for balancing high-dimensional multi-omics data. However, there are few VAE-based integrated multi-omics analyses, and they are limited to pan-cancer analysis. In this work, we performed an integrated multi-omics analysis of ovarian cancer using the compressed features learned through VAE and an improved version of VAE, namely Maximum Mean Discrepancy VAE (MMD-VAE). First, we designed and developed a DL architecture for VAE and MMD-VAE. Then we used the architecture for mono-omics, integrated di-omics, and tri-omics data analysis of ovarian cancer through cancer sample identification, molecular subtype clustering and classification, and survival analysis. The results show that MMD-VAE-based and VAE-based compressed features can classify the transcriptional subtypes of the TCGA datasets with accuracies in the ranges of 93.2-95.5% and 87.1-95.7%, respectively. Also, survival analysis results show that VAE- and MMD-VAE-based compressed representations of omics data can be used in cancer prognosis. Based on the results, we can conclude that (i) VAE and MMD-VAE outperform existing dimensionality reduction techniques, (ii) integrated multi-omics analyses perform better than or similarly to their mono-omics counterparts, and (iii) MMD-VAE performs better than VAE on most omics datasets.
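Note: the abstract names Maximum Mean Discrepancy (MMD) as the ingredient that distinguishes MMD-VAE from a standard VAE, but this record does not give the formula or the authors' implementation. The following is only a minimal sketch of how a squared-MMD estimate between encoder outputs and prior samples is commonly computed. The Gaussian (RBF) kernel, the bandwidth `sigma`, the two-dimensional latent space, and all function and variable names are assumptions made for illustration, not details taken from the article.

```python
import numpy as np


def gaussian_kernel(x, y, sigma=1.0):
    """Pairwise RBF kernel matrix between rows of x (n, d) and y (m, d)."""
    sq_dists = np.sum((x[:, None, :] - y[None, :, :]) ** 2, axis=-1)
    return np.exp(-sq_dists / (2.0 * sigma ** 2))


def mmd_squared(x, y, sigma=1.0):
    """Biased (V-statistic) estimate of the squared MMD between samples x and y."""
    k_xx = gaussian_kernel(x, x, sigma).mean()
    k_yy = gaussian_kernel(y, y, sigma).mean()
    k_xy = gaussian_kernel(x, y, sigma).mean()
    return k_xx + k_yy - 2.0 * k_xy


# Hypothetical stand-ins for latent codes: in an MMD-VAE these would come
# from the encoder; here they are simulated so the sketch is self-contained.
rng = np.random.default_rng(0)
latent_dim = 2  # assumed size, chosen only for illustration
prior_samples = rng.standard_normal((200, latent_dim))        # draws from N(0, I)
matched_codes = rng.standard_normal((200, latent_dim))        # codes that follow the prior
shifted_codes = rng.standard_normal((200, latent_dim)) + 2.0  # codes that drifted away

mmd_matched = mmd_squared(matched_codes, prior_samples)
mmd_shifted = mmd_squared(shifted_codes, prior_samples)
print(f"MMD^2 (codes match prior):  {mmd_matched:.4f}")
print(f"MMD^2 (codes shifted by 2): {mmd_shifted:.4f}")
```

In a training loop, a term of this kind would typically be added to the reconstruction loss in place of (or alongside) the KL divergence term of a standard VAE, so that latent codes which drift away from the prior are penalized more heavily.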
format Online
Article
Text
id pubmed-7973750
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-7973750 2021-03-19 Sci Rep Article Nature Publishing Group UK 2021-03-18 /pmc/articles/PMC7973750/ /pubmed/33737557 http://dx.doi.org/10.1038/s41598-021-85285-4 Text en © The Author(s) 2021, corrected publication 2021 https://creativecommons.org/licenses/by/4.0/ Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ (https://creativecommons.org/licenses/by/4.0/).
title Integrated multi-omics analysis of ovarian cancer using variational autoencoders
topic Article