A random walk-based method to identify driver genes by integrating the subcellular localization and variation frequency into bipartite graph

BACKGROUND: Cancer is a worldwide problem driven by genomic alterations. With the advent of high-throughput sequencing technology, huge amounts of genomic data are generated continuously, offering valuable information about cancer while posing a major challenge to investigators. As th...

Bibliographic Details
Main authors:	Song, Junrong; Peng, Wei; Wang, Feng
Format:	Online Article Text
Language:	English
Published:	BioMed Central 2019
Subjects:	Research Article
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6518800/ https://www.ncbi.nlm.nih.gov/pubmed/31088372 http://dx.doi.org/10.1186/s12859-019-2847-9

author	Song, Junrong; Peng, Wei; Wang, Feng
collection	PubMed
description	BACKGROUND: Cancer is a worldwide problem driven by genomic alterations. With the advent of high-throughput sequencing technology, huge amounts of genomic data are generated continuously, offering valuable information about cancer while posing a major challenge to investigators. A major characteristic of cancer is heterogeneity, and most alterations are thought to be passenger mutations that make no contribution to cancer progression. Hence, identifying driver genes, which confer a selective growth advantage on tumor cells, from such large and noisy data remains an urgent task. RESULTS: Because previous network-based methods ignore some important biological properties of driver genes, and because gene interaction networks have low reliability, we proposed a random walk method named Subdyquency that integrates each gene's subcellular localization, variation frequency, and interactions with other dysregulated genes to improve the prediction accuracy of driver genes. We applied our model to three cancers: lung, prostate and breast cancer. The results show that our model can not only identify well-known driver genes but also prioritize rare, previously unknown driver genes. Moreover, compared with other existing methods, our method achieves higher precision, recall and F-score for most cancer types. CONCLUSIONS: The results imply that driver genes tend to have a higher variation frequency and to affect more dysregulated genes in the common significant compartment. AVAILABILITY: The source code can be obtained at https://github.com/weiba/Subdyquency. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12859-019-2847-9) contains supplementary material, which is available to authorized users.
format	Online Article Text
id	pubmed-6518800
institution	National Center for Biotechnology Information
language	English
publishDate	2019
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-6518800 2019-05-21 Song, Junrong; Peng, Wei; Wang, Feng. BMC Bioinformatics. Research Article. BioMed Central 2019-05-14 /pmc/articles/PMC6518800/ /pubmed/31088372 http://dx.doi.org/10.1186/s12859-019-2847-9 Text en © The Author(s). 2019 Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title	A random walk-based method to identify driver genes by integrating the subcellular localization and variation frequency into bipartite graph
topic	Research Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6518800/ https://www.ncbi.nlm.nih.gov/pubmed/31088372 http://dx.doi.org/10.1186/s12859-019-2847-9


Similar items