
DeepAutoGlioma: a deep learning autoencoder-based multi-omics data integration and classification tools for glioma subtyping


Bibliographic Details
Main Authors: Munquad, Sana; Das, Asim Bikas
Format: Online Article Text
Language: English
Published: BioMed Central 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10652591/
https://www.ncbi.nlm.nih.gov/pubmed/37968655
http://dx.doi.org/10.1186/s13040-023-00349-7
author Munquad, Sana
Das, Asim Bikas
collection PubMed
description BACKGROUND AND OBJECTIVE: Classifying glioma subtypes is essential for precision therapy. Because gliomas are heterogeneous, subtype-specific molecular patterns can be captured by integrating and analyzing high-throughput omics data from different genomic layers. A deep-learning framework enables the integration of multi-omics data to classify glioma subtypes in support of clinical diagnosis. RESULTS: Transcriptome and methylome data from glioma patients were preprocessed, and differentially expressed features from both datasets were identified. A Cox regression analysis then determined the genes and CpGs associated with survival. Gene set enrichment analysis was carried out to examine the biological significance of the features. Further, we identified CpG-gene pairs by mapping CpGs to the promoter regions of their corresponding genes. The methylation and gene expression levels of these CpGs and genes were embedded in a lower-dimensional space with an autoencoder. Next, an ANN and a CNN were used to classify subtypes from the latent features of the embedding space. The CNN performed better than the ANN for subtyping lower-grade gliomas (LGG) and glioblastoma multiforme (GBM). The subtyping accuracy of the CNN was 98.03% (± 0.06) in LGG and 94.07% (± 0.01) in GBM. The precision of the models was 97.67% in LGG and 90.40% in GBM. The model sensitivity was 96.96% in LGG and 91.18% in GBM. Additionally, we observed superior performance of the CNN on external datasets. The CpG-gene pairs used to develop the model performed better than random CpG-gene pairs, preprocessed data, and single-omics data. CONCLUSIONS: The current study showed that a novel feature selection and data integration strategy led to the development of DeepAutoGlioma, an effective framework for diagnosing glioma subtypes. 
SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s13040-023-00349-7.
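The description reports accuracy, precision, and sensitivity for multi-class subtyping. As a minimal sketch of how such figures can be computed from predicted and true subtype labels, the Python below uses macro averaging (an unweighted mean over subtypes). The averaging scheme is an assumption, since the abstract does not state which one the authors used. The function name `subtype_metrics` and the toy labels are hypothetical and are not taken from the article.

```python
from collections import Counter


def subtype_metrics(y_true, y_pred):
    """Return (accuracy, macro precision, macro sensitivity).

    Macro averaging is an assumption made for illustration; the
    abstract does not say how per-subtype scores were combined.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    labels = sorted(set(y_true) | set(y_pred))
    true_pos = Counter()    # correct predictions per subtype
    pred_count = Counter()  # how often each subtype was predicted
    true_count = Counter()  # how often each subtype actually occurs

    for t, p in zip(y_true, y_pred):
        true_count[t] += 1
        pred_count[p] += 1
        if t == p:
            true_pos[t] += 1

    accuracy = sum(true_pos.values()) / len(y_true)
    # Precision: of the samples predicted as a subtype, the fraction that are correct.
    precision = sum(
        true_pos[c] / pred_count[c] if pred_count[c] else 0.0 for c in labels
    ) / len(labels)
    # Sensitivity (recall): of the samples in a subtype, the fraction that are recovered.
    sensitivity = sum(
        true_pos[c] / true_count[c] if true_count[c] else 0.0 for c in labels
    ) / len(labels)
    return accuracy, precision, sensitivity


# Toy labels for three hypothetical subtypes (not data from the article).
y_true = ["s1", "s1", "s1", "s2", "s2", "s3"]
y_pred = ["s1", "s1", "s2", "s2", "s2", "s3"]
acc, prec, sens = subtype_metrics(y_true, y_pred)
print(f"accuracy={acc:.4f} precision={prec:.4f} sensitivity={sens:.4f}")
```

With imbalanced subtypes, macro averages can differ noticeably from overall accuracy, which is one reason a record such as this one lists all three figures separately.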
format Online
Article
Text
id pubmed-10652591
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-10652591 2023-11-15 DeepAutoGlioma: a deep learning autoencoder-based multi-omics data integration and classification tools for glioma subtyping Munquad, Sana Das, Asim Bikas BioData Min Research BioMed Central 2023-11-15 /pmc/articles/PMC10652591/ /pubmed/37968655 http://dx.doi.org/10.1186/s13040-023-00349-7 Text en © The Author(s) 2023. Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License (https://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
title DeepAutoGlioma: a deep learning autoencoder-based multi-omics data integration and classification tools for glioma subtyping
topic Research