Constructing an integrated gene similarity network for the identification of disease genes

BACKGROUND: Discovering novel genes that are involved in human diseases is a challenging task in biomedical research. In recent years, several computational approaches have been proposed to prioritize candidate disease genes. Most of these methods are mainly based on protein-protein interaction (PPI) n...

Bibliographic Details
Main Authors: Tian, Zhen; Guo, Maozu; Wang, Chunyu; Xing, LinLin; Wang, Lei; Zhang, Yin
Format: Online Article Text
Language: English
Published: BioMed Central 2017
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5763299/
https://www.ncbi.nlm.nih.gov/pubmed/29297379
http://dx.doi.org/10.1186/s13326-017-0141-1
author Tian, Zhen
Guo, Maozu
Wang, Chunyu
Xing, LinLin
Wang, Lei
Zhang, Yin
collection PubMed
description BACKGROUND: Discovering novel genes that are involved in human diseases is a challenging task in biomedical research. In recent years, several computational approaches have been proposed to prioritize candidate disease genes. Most of these methods are based mainly on protein-protein interaction (PPI) networks. However, because these PPI networks contain false positives and cover less than half of known human genes, their reliability and coverage are low. It is therefore necessary to fuse multiple genomic data sources to construct a credible gene similarity network and then infer disease genes on the whole-genome scale. RESULTS: We propose a novel method, named RWRB, to infer causal genes for diseases of interest. First, we construct five individual gene (protein) similarity networks based on multiple genomic data of human genes. Then, an integrated gene similarity network (IGSN) is constructed using the similarity network fusion (SNF) method. Finally, we employ the random walk with restart algorithm on the phenotype-gene bilayer network, which combines the phenotype similarity network, the IGSN, and the phenotype-gene association network, to prioritize candidate disease genes. We investigate the effectiveness of RWRB in inferring phenotype-gene relationships through leave-one-out cross-validation. Results show that RWRB is more accurate than state-of-the-art methods on most evaluation metrics. Further analysis shows that the success of RWRB stems from the IGSN, which has wider coverage and higher reliability than current PPI networks. Moreover, we conduct a comprehensive case study for Alzheimer’s disease and predict some novel disease genes that are supported by the literature. CONCLUSIONS: RWRB is an effective and reliable algorithm for prioritizing candidate disease genes on the genomic scale. Software and supplementary information are available at http://nclab.hit.edu.cn/~tianzhen/RWRB/.
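The final step in the abstract, a random walk with restart over a phenotype-gene bilayer network, can be illustrated with a minimal sketch. This is not the authors' RWRB implementation: the toy matrices, the function names, the restart probability of 0.7, and the choice to column-normalize the whole stacked matrix at once are all assumptions made for illustration, and the SNF fusion step that produces the IGSN is not shown.

```python
def build_bilayer(gene_sim, pheno_sim, assoc):
    """Stack [[G, B], [B^T, P]] into one adjacency matrix (genes first).

    assoc[i][k] is the link weight between gene i and phenotype k.
    """
    n, m = len(gene_sim), len(pheno_sim)
    top = [gene_sim[i] + assoc[i] for i in range(n)]
    bottom = [[assoc[i][k] for i in range(n)] + pheno_sim[k] for k in range(m)]
    return top + bottom


def normalize_columns(matrix):
    """Scale every column to sum to 1; all-zero columns stay zero."""
    size = len(matrix)
    sums = [sum(matrix[i][j] for i in range(size)) for j in range(size)]
    return [[matrix[i][j] / sums[j] if sums[j] else 0.0 for j in range(size)]
            for i in range(size)]


def random_walk_with_restart(transition, seeds, restart=0.7,
                             tol=1e-10, max_iter=1000):
    """Iterate p <- (1 - r) * W p + r * p0 until the L1 change is below tol."""
    total = sum(seeds)
    p0 = [s / total for s in seeds]
    p = p0[:]
    size = len(p)
    for _ in range(max_iter):
        nxt = [(1 - restart) * sum(transition[i][j] * p[j] for j in range(size))
               + restart * p0[i] for i in range(size)]
        if sum(abs(a - b) for a, b in zip(nxt, p)) < tol:
            return nxt
        p = nxt
    return p


# Hypothetical toy data: three genes and two phenotypes.
gene_names = ["g0", "g1", "g2"]
G = [[0.0, 0.9, 0.1],
     [0.9, 0.0, 0.1],
     [0.1, 0.1, 0.0]]          # gene similarity (stand-in for the IGSN)
P = [[0.0, 0.5],
     [0.5, 0.0]]               # phenotype similarity
B = [[1.0, 0.0],
     [0.0, 0.0],
     [0.0, 0.0]]               # g0 is the only known gene for phenotype 0

transition = normalize_columns(build_bilayer(G, P, B))

# Seed the walk on the query phenotype (index 3) and its known gene g0.
seeds = [0.5, 0.0, 0.0, 0.5, 0.0]
scores = random_walk_with_restart(transition, seeds)

# Rank the unlabeled genes by their steady-state probability.
candidates = ["g1", "g2"]
ranking = sorted(candidates,
                 key=lambda g: scores[gene_names.index(g)], reverse=True)
print(ranking)
```

In this toy network g1 is far more similar to the seed gene g0 than g2 is, so it receives the higher steady-state score. A leave-one-out evaluation of the kind the abstract mentions would repeat this with one known phenotype-gene link removed from `B` at a time and check where the held-out gene lands in the ranking.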
format Online
Article
Text
id pubmed-5763299
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-5763299 2018-01-17 Constructing an integrated gene similarity network for the identification of disease genes. Tian, Zhen; Guo, Maozu; Wang, Chunyu; Xing, LinLin; Wang, Lei; Zhang, Yin. J Biomed Semantics. Research. BioMed Central 2017-09-20 /pmc/articles/PMC5763299/ /pubmed/29297379 http://dx.doi.org/10.1186/s13326-017-0141-1 Text en © The Author(s). 2017 Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Constructing an integrated gene similarity network for the identification of disease genes
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5763299/
https://www.ncbi.nlm.nih.gov/pubmed/29297379
http://dx.doi.org/10.1186/s13326-017-0141-1