Multi-dimensional data integration algorithm based on random walk with restart
BACKGROUND: The accumulation of various multi-omics data and computational approaches for data integration can accelerate the development of precision medicine. However, the algorithm development for multi-omics data integration remains a pressing challenge. RESULTS: Here, we propose a multi-omics d...
Main authors: | Wen, Yuqi; Song, Xinyu; Yan, Bowei; Yang, Xiaoxi; Wu, Lianlian; Leng, Dongjin; He, Song; Bo, Xiaochen |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2021 |
Subjects: | Methodology Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7912853/ https://www.ncbi.nlm.nih.gov/pubmed/33639858 http://dx.doi.org/10.1186/s12859-021-04029-3 |
_version_ | 1783656671778701312 |
---|---|
author | Wen, Yuqi; Song, Xinyu; Yan, Bowei; Yang, Xiaoxi; Wu, Lianlian; Leng, Dongjin; He, Song; Bo, Xiaochen |
author_sort | Wen, Yuqi |
collection | PubMed |
description | BACKGROUND: The accumulation of multi-omics data and of computational approaches for data integration can accelerate the development of precision medicine. However, developing algorithms for multi-omics data integration remains a pressing challenge. RESULTS: Here, we propose a multi-omics data integration algorithm based on random walk with restart (RWR) on a multiplex network. We call the resulting methodology Random Walk with Restart for multi-dimensional data Fusion (RWRF). RWRF uses similarity networks of samples as the basis for integration. It constructs a similarity network for each data type and then connects the corresponding samples across the similarity networks to create a multiplex sample network. By applying RWR on the multiplex network, RWRF uses the stationary probability distribution to fuse the similarity networks. We applied RWRF to The Cancer Genome Atlas (TCGA) data to identify subtypes in different cancer data sets. Three types of data (mRNA expression, DNA methylation, and microRNA expression) are integrated, and network clustering is conducted. Experimental results show that RWRF performs better than single data type analysis and previous integrative methods. CONCLUSIONS: RWRF provides powerful support for users deciphering cancer molecular subtypes, and thus may benefit the precision treatment of specific patients in clinical practice. |
format | Online Article Text |
id | pubmed-7912853 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-7912853 2021-03-02 Multi-dimensional data integration algorithm based on random walk with restart Wen, Yuqi; Song, Xinyu; Yan, Bowei; Yang, Xiaoxi; Wu, Lianlian; Leng, Dongjin; He, Song; Bo, Xiaochen BMC Bioinformatics Methodology Article BioMed Central 2021-02-27 /pmc/articles/PMC7912853/ /pubmed/33639858 http://dx.doi.org/10.1186/s12859-021-04029-3 Text en © The Author(s) 2021 Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data. |
title | Multi-dimensional data integration algorithm based on random walk with restart |
topic | Methodology Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7912853/ https://www.ncbi.nlm.nih.gov/pubmed/33639858 http://dx.doi.org/10.1186/s12859-021-04029-3 |
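
The description field of this record outlines the approach in words: build one sample similarity network per data type, link the corresponding samples across networks into a multiplex network, run random walk with restart (RWR), and use the stationary distribution to fuse the networks. The Python sketch below illustrates that general idea only. It is not the authors' RWRF implementation: the function names, the `jump` (inter-layer hop probability) and `restart` parameters, their default values, and the way the stationary probabilities are summed over layers and symmetrized are all assumptions made for illustration.

```python
import numpy as np


def column_normalize(m):
    """Scale each column to sum to 1; all-zero columns stay zero."""
    sums = m.sum(axis=0, keepdims=True)
    return np.divide(m, sums, out=np.zeros_like(m, dtype=float), where=sums > 0)


def build_multiplex_transition(layers, jump=0.5):
    """Column-stochastic transition matrix of a multiplex sample network.

    layers: list of (n, n) non-negative similarity matrices, one per data type.
    jump:   assumed probability of hopping to the same sample in another layer.
    """
    n_layers, n = len(layers), layers[0].shape[0]
    supra = np.zeros((n_layers * n, n_layers * n))
    for a in range(n_layers):
        for b in range(n_layers):
            # Basic slicing returns a view, so writing to it fills `supra`.
            block = supra[a * n:(a + 1) * n, b * n:(b + 1) * n]
            if a == b:
                w = np.array(layers[a], dtype=float)  # copy, input untouched
                np.fill_diagonal(w, 0.0)  # ignore self-similarity
                block[:] = (1.0 - jump) * column_normalize(w)
            else:
                # Link each sample to its counterpart in the other layer.
                block[:] = (jump / (n_layers - 1)) * np.eye(n)
    # Renormalize in case a sample has no neighbours in some layer.
    return column_normalize(supra)


def rwr(transition, seed, restart=0.7, tol=1e-10, max_iter=1000):
    """Iterate p <- (1 - restart) * M p + restart * p0 until it stabilizes."""
    p0 = seed / seed.sum()
    p = p0.copy()
    for _ in range(max_iter):
        p_next = (1.0 - restart) * (transition @ p) + restart * p0
        if np.abs(p_next - p).sum() < tol:
            return p_next
        p = p_next
    return p


def fuse(layers, restart=0.7, jump=0.5):
    """Fuse similarity layers into one (n, n) matrix via RWR on the multiplex."""
    n_layers, n = len(layers), layers[0].shape[0]
    transition = build_multiplex_transition(layers, jump)
    fused = np.zeros((n, n))
    for i in range(n):
        seed = np.zeros(n_layers * n)
        seed[i::n] = 1.0  # seed sample i in every layer
        p = rwr(transition, seed, restart)
        fused[i] = p.reshape(n_layers, n).sum(axis=0)  # aggregate over layers
    return (fused + fused.T) / 2.0  # symmetrize


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    toy = []
    for _ in range(3):  # three hypothetical data types, five samples each
        m = rng.random((5, 5))
        toy.append((m + m.T) / 2.0)
    print(np.round(fuse(toy), 3))
```

The fused matrix could then be passed to any network or spectral clustering routine to look for sample subtypes; which clustering method to use is left open here.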