
Multi-omics Data Integration Model Based on UMAP Embedding and Convolutional Neural Network

INTRODUCTION: Multi-omics data integration yields a richer understanding than analyzing each omics data type separately. Various promising integrative approaches have been used to analyze multi-omics data for biomedical applications, including disease prediction and disease subtyping, bio...


Bibliographic Details
Main authors: ElKarami, Bashier, Alkhateeb, Abedalrhman, Qattous, Hazem, Alshomali, Lujain, Shahrrava, Behnam
Format: Online Article Text
Language: English
Published: SAGE Publications 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9523837/
https://www.ncbi.nlm.nih.gov/pubmed/36187912
http://dx.doi.org/10.1177/11769351221124205
author ElKarami, Bashier
Alkhateeb, Abedalrhman
Qattous, Hazem
Alshomali, Lujain
Shahrrava, Behnam
collection PubMed
description INTRODUCTION: Multi-omics data integration yields a richer understanding than analyzing each omics data type separately. Various promising integrative approaches have been used to analyze multi-omics data for biomedical applications, including disease prediction, disease subtyping, biomarker prediction, and others. METHODS: In this paper, we introduce a multi-omics data integration method that combines a gene similarity network (GSN) based on uniform manifold approximation and projection (UMAP) with convolutional neural networks (CNNs). The method uses UMAP to embed gene expression, DNA methylation, and copy number alteration (CNA) data into a lower dimension, creating two-dimensional RGB images. Gene expression is used as the reference to construct the GSN, and the other omics data are then integrated with the gene expression for better prediction. We used CNNs to predict the Gleason score levels of prostate cancer patients and the tumor stage of breast cancer patients. RESULTS: The proposed model achieved near-perfect performance, with accuracy above 99% and all other performance measures at the same level. The proposed model outperformed the state-of-the-art iSOM-GSN model, which constructs the GSN map using a self-organizing map (SOM). CONCLUSION: The results show that UMAP, as an embedding technique, can better integrate multi-omics maps into the prediction model than SOM. The proposed model can also be applied to build multi-omics prediction models for other types of cancer.
format Online
Article
Text
id pubmed-9523837
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher SAGE Publications
record_format MEDLINE/PubMed
spelling pubmed-9523837 2022-10-01 Multi-omics Data Integration Model Based on UMAP Embedding and Convolutional Neural Network ElKarami, Bashier Alkhateeb, Abedalrhman Qattous, Hazem Alshomali, Lujain Shahrrava, Behnam Cancer Inform Original Research
SAGE Publications 2022-09-28 /pmc/articles/PMC9523837/ /pubmed/36187912 http://dx.doi.org/10.1177/11769351221124205 Text en © The Author(s) 2022 https://creativecommons.org/licenses/by-nc/4.0/ This article is distributed under the terms of the Creative Commons Attribution-NonCommercial 4.0 License (https://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial use, reproduction and distribution of the work without further permission provided the original work is attributed as specified on the SAGE and Open Access page (https://us.sagepub.com/en-us/nam/open-access-at-sage).
title Multi-omics Data Integration Model Based on UMAP Embedding and Convolutional Neural Network
topic Original Research
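
The METHODS summary in this record describes placing genes on a two-dimensional map with UMAP, using gene expression as the reference, and then writing the three omics layers into the channels of an RGB image for each patient. The sketch below illustrates that idea only; it is not the authors' implementation. The function names (`embed_genes`, `to_pixels`, `sample_to_rgb`), the channel order (expression, methylation, CNA), the averaging of genes that share a pixel, and the assumption that all values are already scaled to [0, 1] are hypothetical choices made for illustration. `embed_genes` assumes the `umap-learn` package is installed; the other two functions use only plain Python.

```python
def embed_genes(expression):
    """Return one 2D point per gene from a genes-by-samples matrix.

    Assumption: the umap-learn package is installed. It is imported
    lazily so the rest of this sketch runs without it.
    """
    import umap

    reducer = umap.UMAP(n_components=2, random_state=0)
    return reducer.fit_transform(expression)


def to_pixels(coords, size):
    """Scale 2D gene coordinates to (row, col) positions on a size x size grid."""
    xs = [c[0] for c in coords]
    ys = [c[1] for c in coords]
    x_min, y_min = min(xs), min(ys)
    # Guard against a zero span when all genes share one coordinate.
    x_span = (max(xs) - x_min) or 1.0
    y_span = (max(ys) - y_min) or 1.0
    return [
        (
            int((y - y_min) / y_span * (size - 1)),
            int((x - x_min) / x_span * (size - 1)),
        )
        for x, y in zip(xs, ys)
    ]


def sample_to_rgb(pixels, expr, meth, cna, size):
    """Build one size x size RGB image for a single patient.

    expr, meth and cna hold one value per gene, assumed to be scaled
    to [0, 1]. Genes that land on the same pixel are averaged.
    """
    image = [[[0.0, 0.0, 0.0] for _ in range(size)] for _ in range(size)]
    counts = [[0] * size for _ in range(size)]
    for (row, col), r, g, b in zip(pixels, expr, meth, cna):
        cell = image[row][col]
        cell[0] += r
        cell[1] += g
        cell[2] += b
        counts[row][col] += 1
    for row in range(size):
        for col in range(size):
            n = counts[row][col]
            if n > 1:
                image[row][col] = [v / n for v in image[row][col]]
    return image
```

In this sketch the gene layout is computed once from the expression matrix and reused for every patient, so a given pixel always refers to the same gene or genes. The resulting images could then be stacked and passed to any image classifier, such as a CNN.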