New directions for diffusion-based network prediction of protein function: incorporating pathways with confidence
Motivation: It has long been hypothesized that incorporating models of network noise as well as edge directions and known pathway information into the representation of protein–protein interaction (PPI) networks might improve their utility for functional inference. However, a simple way to do this h...
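Main authors: | Mengfei Cao, Christopher M. Pietras, Xian Feng, Kathryn J. Doroschak, Thomas Schaffner, Jisoo Park, Hao Zhang, Lenore J. Cowen, Benjamin J. Hescott |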
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2014 |
Subjects: | Ismb 2014 Proceedings Papers Committee |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4058952/ https://www.ncbi.nlm.nih.gov/pubmed/24931987 http://dx.doi.org/10.1093/bioinformatics/btu263 |
_version_ | 1782321193088450560 |
---|---|
author | Cao, Mengfei Pietras, Christopher M. Feng, Xian Doroschak, Kathryn J. Schaffner, Thomas Park, Jisoo Zhang, Hao Cowen, Lenore J. Hescott, Benjamin J. |
author_facet | Cao, Mengfei Pietras, Christopher M. Feng, Xian Doroschak, Kathryn J. Schaffner, Thomas Park, Jisoo Zhang, Hao Cowen, Lenore J. Hescott, Benjamin J. |
author_sort | Cao, Mengfei |
collection | PubMed |
description | Motivation: It has long been hypothesized that incorporating models of network noise as well as edge directions and known pathway information into the representation of protein–protein interaction (PPI) networks might improve their utility for functional inference. However, a simple way to do this has not been obvious. We find that diffusion state distance (DSD), our recent diffusion-based metric for measuring dissimilarity in PPI networks, has natural extensions that incorporate confidence, directions and can even express coherent pathways by calculating DSD on an augmented graph. Results: We define three incremental versions of DSD which we term cDSD, caDSD and capDSD, where the capDSD matrix incorporates confidence, known directed edges, and pathways into the measure of how similar each pair of nodes is according to the structure of the PPI network. We test four popular function prediction methods (majority vote, weighted majority vote, multi-way cut and functional flow) using these different matrices on the Baker’s yeast PPI network in cross-validation. The best performing method is weighted majority vote using capDSD. We then test the performance of our augmented DSD methods on an integrated heterogeneous set of protein association edges from the STRING database. The superior performance of capDSD in this context confirms that treating the pathways as probabilistic units is more powerful than simply incorporating pathway edges independently into the network. Availability: All source code for calculating the confidences, for extracting pathway information from KEGG XML files, and for calculating the cDSD, caDSD and capDSD matrices are available from http://dsd.cs.tufts.edu/capdsd Contact: lenore.cowen@tufts.edu or benjamin.hescott@tufts.edu Supplementary information: Supplementary data are available at Bioinformatics online. |
format | Online Article Text |
id | pubmed-4058952 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2014 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-4058952 2014-06-18 New directions for diffusion-based network prediction of protein function: incorporating pathways with confidence Cao, Mengfei Pietras, Christopher M. Feng, Xian Doroschak, Kathryn J. Schaffner, Thomas Park, Jisoo Zhang, Hao Cowen, Lenore J. Hescott, Benjamin J. Bioinformatics Ismb 2014 Proceedings Papers Committee Motivation: It has long been hypothesized that incorporating models of network noise as well as edge directions and known pathway information into the representation of protein–protein interaction (PPI) networks might improve their utility for functional inference. However, a simple way to do this has not been obvious. We find that diffusion state distance (DSD), our recent diffusion-based metric for measuring dissimilarity in PPI networks, has natural extensions that incorporate confidence, directions and can even express coherent pathways by calculating DSD on an augmented graph. Results: We define three incremental versions of DSD which we term cDSD, caDSD and capDSD, where the capDSD matrix incorporates confidence, known directed edges, and pathways into the measure of how similar each pair of nodes is according to the structure of the PPI network. We test four popular function prediction methods (majority vote, weighted majority vote, multi-way cut and functional flow) using these different matrices on the Baker’s yeast PPI network in cross-validation. The best performing method is weighted majority vote using capDSD. We then test the performance of our augmented DSD methods on an integrated heterogeneous set of protein association edges from the STRING database. The superior performance of capDSD in this context confirms that treating the pathways as probabilistic units is more powerful than simply incorporating pathway edges independently into the network. Availability: All source code for calculating the confidences, for extracting pathway information from KEGG XML files, and for calculating the cDSD, caDSD and capDSD matrices are available from http://dsd.cs.tufts.edu/capdsd Contact: lenore.cowen@tufts.edu or benjamin.hescott@tufts.edu Supplementary information: Supplementary data are available at Bioinformatics online. Oxford University Press 2014-06-15 2014-06-11 /pmc/articles/PMC4058952/ /pubmed/24931987 http://dx.doi.org/10.1093/bioinformatics/btu263 Text en © The Author 2014. Published by Oxford University Press. http://creativecommons.org/licenses/by-nc/3.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com |
spellingShingle | Ismb 2014 Proceedings Papers Committee Cao, Mengfei Pietras, Christopher M. Feng, Xian Doroschak, Kathryn J. Schaffner, Thomas Park, Jisoo Zhang, Hao Cowen, Lenore J. Hescott, Benjamin J. New directions for diffusion-based network prediction of protein function: incorporating pathways with confidence |
title | New directions for diffusion-based network prediction of protein function: incorporating pathways with confidence |
title_full | New directions for diffusion-based network prediction of protein function: incorporating pathways with confidence |
title_fullStr | New directions for diffusion-based network prediction of protein function: incorporating pathways with confidence |
title_full_unstemmed | New directions for diffusion-based network prediction of protein function: incorporating pathways with confidence |
title_short | New directions for diffusion-based network prediction of protein function: incorporating pathways with confidence |
title_sort | new directions for diffusion-based network prediction of protein function: incorporating pathways with confidence |
topic | Ismb 2014 Proceedings Papers Committee |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4058952/ https://www.ncbi.nlm.nih.gov/pubmed/24931987 http://dx.doi.org/10.1093/bioinformatics/btu263 |