Multi-dimensional data integration algorithm based on random walk with restart

Bibliographic Details
Main authors: Wen, Yuqi, Song, Xinyu, Yan, Bowei, Yang, Xiaoxi, Wu, Lianlian, Leng, Dongjin, He, Song, Bo, Xiaochen
Format: Online Article Text
Language: English
Published: BioMed Central 2021
Subjects: Methodology Article
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7912853/
https://www.ncbi.nlm.nih.gov/pubmed/33639858
http://dx.doi.org/10.1186/s12859-021-04029-3
author Wen, Yuqi
Song, Xinyu
Yan, Bowei
Yang, Xiaoxi
Wu, Lianlian
Leng, Dongjin
He, Song
Bo, Xiaochen
collection PubMed
description BACKGROUND: The accumulation of multi-omics data and of computational approaches for integrating them can accelerate the development of precision medicine. However, developing algorithms for multi-omics data integration remains a pressing challenge. RESULTS: Here, we propose a multi-omics data integration algorithm based on random walk with restart (RWR) on a multiplex network. We call the resulting methodology Random Walk with Restart for multi-dimensional data Fusion (RWRF). RWRF uses sample similarity networks as the basis for integration. It constructs a similarity network for each data type and then connects the corresponding samples across these networks to create a multiplex sample network. By applying RWR on the multiplex network, RWRF uses the stationary probability distribution to fuse the similarity networks. We applied RWRF to The Cancer Genome Atlas (TCGA) data to identify subtypes in different cancer data sets. Three types of data (mRNA expression, DNA methylation, and microRNA expression) were integrated, and network clustering was conducted. Experimental results show that RWRF performs better than single data type analysis and previous integrative methods. CONCLUSIONS: RWRF provides powerful support for deciphering cancer molecular subtypes and thus may benefit precision treatment of specific patients in clinical practice.
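The fusion procedure summarized in the description can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names, the restart probability `restart`, the inter-layer jump probability `jump_prob`, the choice to seed each walk on one sample in every layer, and the final symmetrization are all assumptions made for the example. It also assumes every sample has at least one non-zero similarity in each layer, so that each column can be normalized.

```python
import numpy as np


def column_normalize(m):
    """Scale each column to sum to 1 (all-zero columns are left as zeros)."""
    m = np.asarray(m, dtype=float)
    col_sums = m.sum(axis=0, keepdims=True)
    col_sums[col_sums == 0] = 1.0
    return m / col_sums


def build_multiplex_transition(layers, jump_prob=0.5):
    """Build a column-stochastic transition matrix over all (layer, sample) nodes.

    Diagonal blocks hold within-layer moves along similarity edges.
    Off-diagonal blocks are scaled identities that connect each sample
    to its own copy in the other layers (hypothetical weighting).
    """
    n_layers = len(layers)
    n = layers[0].shape[0]
    blocks = [[None] * n_layers for _ in range(n_layers)]
    for i in range(n_layers):
        for j in range(n_layers):
            if i == j:
                stay = 1.0 if n_layers == 1 else 1.0 - jump_prob
                blocks[i][j] = stay * column_normalize(layers[i])
            else:
                blocks[i][j] = jump_prob / (n_layers - 1) * np.eye(n)
    return np.block(blocks)


def random_walk_with_restart(transition, seed, restart=0.7, tol=1e-10, max_iter=1000):
    """Iterate p <- (1 - r) * T p + r * p0 until the L1 change drops below tol."""
    p0 = seed / seed.sum()
    p = p0.copy()
    for _ in range(max_iter):
        p_next = (1.0 - restart) * (transition @ p) + restart * p0
        if np.abs(p_next - p).sum() < tol:
            return p_next
        p = p_next
    return p


def fuse_similarity(layers, restart=0.7, jump_prob=0.5):
    """Fuse per-data-type similarity matrices into one sample-by-sample matrix."""
    n_layers = len(layers)
    n = layers[0].shape[0]
    transition = build_multiplex_transition(layers, jump_prob)
    fused = np.zeros((n, n))
    for s in range(n):
        seed = np.zeros(n_layers * n)
        seed[s::n] = 1.0  # copies of sample s sit at indices s, s + n, s + 2n, ...
        p = random_walk_with_restart(transition, seed, restart)
        # Fold the stationary distribution back onto samples by summing over layers.
        fused[:, s] = p.reshape(n_layers, n).sum(axis=0)
    return (fused + fused.T) / 2.0  # symmetrize (assumption for this sketch)


if __name__ == "__main__":
    rng = np.random.default_rng(42)
    # Three toy similarity matrices standing in for mRNA, methylation, and miRNA.
    toy_layers = []
    for _ in range(3):
        a = rng.random((6, 6))
        toy_layers.append((a + a.T) / 2.0)
    print(fuse_similarity(toy_layers).round(3))
```

The fused matrix could then be passed to any network clustering method; the record does not state which one is used, so that step is left out of the sketch.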
format Online
Article
Text
id pubmed-7912853
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-7912853 2021-03-02 BMC Bioinformatics Methodology Article
BioMed Central 2021-02-27 /pmc/articles/PMC7912853/ /pubmed/33639858 http://dx.doi.org/10.1186/s12859-021-04029-3 Text en © The Author(s) 2021 Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
title Multi-dimensional data integration algorithm based on random walk with restart
topic Methodology Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7912853/
https://www.ncbi.nlm.nih.gov/pubmed/33639858
http://dx.doi.org/10.1186/s12859-021-04029-3