An extensive analysis of disease-gene associations using network integration and fast kernel-based gene prioritization methods

OBJECTIVE: In the context of “network medicine”, gene prioritization methods represent one of the main tools to discover candidate disease genes by exploiting the large amount of data covering different types of functional relationships between genes. Several works proposed to integrate multiple sou...

Bibliographic Details
Main authors: Valentini, Giorgio, Paccanaro, Alberto, Caniza, Horacio, Romero, Alfonso E., Re, Matteo
Format: Online Article Text
Language: English
Published: Elsevier Science Publishing 2014
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4070077/
https://www.ncbi.nlm.nih.gov/pubmed/24726035
http://dx.doi.org/10.1016/j.artmed.2014.03.003
author Valentini, Giorgio
Paccanaro, Alberto
Caniza, Horacio
Romero, Alfonso E.
Re, Matteo
collection PubMed
description OBJECTIVE: In the context of “network medicine”, gene prioritization methods represent one of the main tools to discover candidate disease genes by exploiting the large amount of data covering different types of functional relationships between genes. Several works have proposed integrating multiple sources of data to improve disease gene prioritization, but to our knowledge no systematic study has focused on the quantitative evaluation of the impact of network integration on gene prioritization. In this paper, we aim to provide an extensive analysis of gene-disease associations not limited to genetic disorders, and a systematic comparison of different network integration methods for gene prioritization. MATERIALS AND METHODS: We collected nine different functional networks representing different functional relationships between genes, and we combined them through both unweighted and weighted network integration methods. We then prioritized genes with respect to each of the considered 708 medical subject headings (MeSH) diseases by applying classical guilt-by-association, random walk and random walk with restart algorithms, and the recently proposed kernelized score functions. RESULTS: The results obtained with classical random walk algorithms and the best single network achieved an average area under the curve (AUC) across the 708 MeSH diseases of about 0.82, while kernelized score functions and network integration boosted the average AUC to about 0.89. Weighted integration, by exploiting the different “informativeness” embedded in different functional networks, outperforms unweighted integration at the 0.01 significance level, according to the Wilcoxon signed rank sum test. For each MeSH disease we provide the top-ranked unannotated candidate genes, available for further bio-medical investigation. CONCLUSIONS: Network integration is necessary to boost the performance of gene prioritization methods. Moreover, the methods based on kernelized score functions can further enhance disease gene ranking results by adopting both local and global learning strategies that are able to exploit the overall topology of the network.
format Online
Article
Text
id pubmed-4070077
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher Elsevier Science Publishing
record_format MEDLINE/PubMed
spelling pubmed-4070077 2014-06-26 An extensive analysis of disease-gene associations using network integration and fast kernel-based gene prioritization methods Valentini, Giorgio Paccanaro, Alberto Caniza, Horacio Romero, Alfonso E. Re, Matteo Artif Intell Med Article Elsevier Science Publishing 2014-06 /pmc/articles/PMC4070077/ /pubmed/24726035 http://dx.doi.org/10.1016/j.artmed.2014.03.003 Text en © 2014 The Authors http://creativecommons.org/licenses/by/3.0/ This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/3.0/).
title An extensive analysis of disease-gene associations using network integration and fast kernel-based gene prioritization methods
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4070077/
https://www.ncbi.nlm.nih.gov/pubmed/24726035
http://dx.doi.org/10.1016/j.artmed.2014.03.003
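Illustrative note on the methods named in the description: the abstract mentions weighted network integration and random walk with restart, but gives no formulas. The sketch below is a minimal, generic version of both ideas and is not the authors' implementation. The function names `integrate` and `rwr`, the toy 4-gene networks, the weights 0.75/0.25, and the restart probability 0.6 are all assumptions chosen for illustration; the kernelized score functions from the paper are not reproduced here.

```python
def integrate(networks, weights=None):
    """Combine adjacency matrices by a (weighted) average.

    With weights=None every network counts equally (unweighted integration).
    The weights are placeholders; how they are derived is not specified here.
    """
    n = len(networks[0])
    if weights is None:
        weights = [1.0] * len(networks)
    total = sum(weights)
    return [[sum(w * net[i][j] for w, net in zip(weights, networks)) / total
             for j in range(n)]
            for i in range(n)]


def rwr(adj, seeds, restart=0.6, tol=1e-10, max_iter=1000):
    """Random walk with restart from a set of seed (known disease) genes.

    Iterates p <- (1 - restart) * W p + restart * p0, where W is the
    column-normalised adjacency matrix and p0 is uniform over the seeds.
    """
    n = len(adj)
    col_sums = [sum(adj[i][j] for i in range(n)) for j in range(n)]
    p0 = [1.0 / len(seeds) if i in seeds else 0.0 for i in range(n)]
    p = p0[:]
    for _ in range(max_iter):
        new = [
            (1 - restart) * sum(adj[i][j] / col_sums[j] * p[j]
                                for j in range(n) if col_sums[j] > 0)
            + restart * p0[i]
            for i in range(n)
        ]
        if sum(abs(a - b) for a, b in zip(new, p)) < tol:
            return new
        p = new
    return p


# Toy data (hypothetical): a path 0-1-2-3 and a second network linking 0 and 3.
path = [[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]]
extra = [[0, 0, 0, 1], [0, 0, 0, 0], [0, 0, 0, 0], [1, 0, 0, 0]]

combined = integrate([path, extra], weights=[0.75, 0.25])
path_scores = rwr(path, seeds={0})
combined_scores = rwr(combined, seeds={0})

# Rank genes by score, highest first; non-seed genes at the top are candidates.
ranking = sorted(range(4), key=lambda g: -path_scores[g])
```

On the path network alone, the scores fall off with distance from the seed gene, which is the guilt-by-association intuition: genes closer to known disease genes rank higher. Adding the second network changes the ranking only to the extent its weight allows.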