New directions for diffusion-based network prediction of protein function: incorporating pathways with confidence

Motivation: It has long been hypothesized that incorporating models of network noise as well as edge directions and known pathway information into the representation of protein–protein interaction (PPI) networks might improve their utility for functional inference. However, a simple way to do this h...

Bibliographic Details
Main authors: Cao, Mengfei; Pietras, Christopher M.; Feng, Xian; Doroschak, Kathryn J.; Schaffner, Thomas; Park, Jisoo; Zhang, Hao; Cowen, Lenore J.; Hescott, Benjamin J.
Format: Online Article Text
Language: English
Published: Oxford University Press 2014
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4058952/
https://www.ncbi.nlm.nih.gov/pubmed/24931987
http://dx.doi.org/10.1093/bioinformatics/btu263
author Cao, Mengfei
Pietras, Christopher M.
Feng, Xian
Doroschak, Kathryn J.
Schaffner, Thomas
Park, Jisoo
Zhang, Hao
Cowen, Lenore J.
Hescott, Benjamin J.
author_facet Cao, Mengfei
Pietras, Christopher M.
Feng, Xian
Doroschak, Kathryn J.
Schaffner, Thomas
Park, Jisoo
Zhang, Hao
Cowen, Lenore J.
Hescott, Benjamin J.
author_sort Cao, Mengfei
collection PubMed
description Motivation: It has long been hypothesized that incorporating models of network noise as well as edge directions and known pathway information into the representation of protein–protein interaction (PPI) networks might improve their utility for functional inference. However, a simple way to do this has not been obvious. We find that diffusion state distance (DSD), our recent diffusion-based metric for measuring dissimilarity in PPI networks, has natural extensions that incorporate confidence, directions and can even express coherent pathways by calculating DSD on an augmented graph. Results: We define three incremental versions of DSD which we term cDSD, caDSD and capDSD, where the capDSD matrix incorporates confidence, known directed edges, and pathways into the measure of how similar each pair of nodes is according to the structure of the PPI network. We test four popular function prediction methods (majority vote, weighted majority vote, multi-way cut and functional flow) using these different matrices on the Baker’s yeast PPI network in cross-validation. The best performing method is weighted majority vote using capDSD. We then test the performance of our augmented DSD methods on an integrated heterogeneous set of protein association edges from the STRING database. The superior performance of capDSD in this context confirms that treating the pathways as probabilistic units is more powerful than simply incorporating pathway edges independently into the network. Availability: All source code for calculating the confidences, for extracting pathway information from KEGG XML files, and for calculating the cDSD, caDSD and capDSD matrices are available from http://dsd.cs.tufts.edu/capdsd Contact: lenore.cowen@tufts.edu or benjamin.hescott@tufts.edu Supplementary information: Supplementary data are available at Bioinformatics online.
format Online
Article
Text
id pubmed-4058952
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-40589522014-06-18 New directions for diffusion-based network prediction of protein function: incorporating pathways with confidence Cao, Mengfei Pietras, Christopher M. Feng, Xian Doroschak, Kathryn J. Schaffner, Thomas Park, Jisoo Zhang, Hao Cowen, Lenore J. Hescott, Benjamin J. Bioinformatics Ismb 2014 Proceedings Papers Committee Motivation: It has long been hypothesized that incorporating models of network noise as well as edge directions and known pathway information into the representation of protein–protein interaction (PPI) networks might improve their utility for functional inference. However, a simple way to do this has not been obvious. We find that diffusion state distance (DSD), our recent diffusion-based metric for measuring dissimilarity in PPI networks, has natural extensions that incorporate confidence, directions and can even express coherent pathways by calculating DSD on an augmented graph. Results: We define three incremental versions of DSD which we term cDSD, caDSD and capDSD, where the capDSD matrix incorporates confidence, known directed edges, and pathways into the measure of how similar each pair of nodes is according to the structure of the PPI network. We test four popular function prediction methods (majority vote, weighted majority vote, multi-way cut and functional flow) using these different matrices on the Baker’s yeast PPI network in cross-validation. The best performing method is weighted majority vote using capDSD. We then test the performance of our augmented DSD methods on an integrated heterogeneous set of protein association edges from the STRING database. The superior performance of capDSD in this context confirms that treating the pathways as probabilistic units is more powerful than simply incorporating pathway edges independently into the network. 
Availability: All source code for calculating the confidences, for extracting pathway information from KEGG XML files, and for calculating the cDSD, caDSD and capDSD matrices are available from http://dsd.cs.tufts.edu/capdsd Contact: lenore.cowen@tufts.edu or benjamin.hescott@tufts.edu Supplementary information: Supplementary data are available at Bioinformatics online. Oxford University Press 2014-06-15 2014-06-11 /pmc/articles/PMC4058952/ /pubmed/24931987 http://dx.doi.org/10.1093/bioinformatics/btu263 Text en © The Author 2014. Published by Oxford University Press. http://creativecommons.org/licenses/by-nc/3.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
spellingShingle Ismb 2014 Proceedings Papers Committee
Cao, Mengfei
Pietras, Christopher M.
Feng, Xian
Doroschak, Kathryn J.
Schaffner, Thomas
Park, Jisoo
Zhang, Hao
Cowen, Lenore J.
Hescott, Benjamin J.
New directions for diffusion-based network prediction of protein function: incorporating pathways with confidence
title New directions for diffusion-based network prediction of protein function: incorporating pathways with confidence
title_full New directions for diffusion-based network prediction of protein function: incorporating pathways with confidence
title_fullStr New directions for diffusion-based network prediction of protein function: incorporating pathways with confidence
title_full_unstemmed New directions for diffusion-based network prediction of protein function: incorporating pathways with confidence
title_short New directions for diffusion-based network prediction of protein function: incorporating pathways with confidence
title_sort new directions for diffusion-based network prediction of protein function: incorporating pathways with confidence
topic Ismb 2014 Proceedings Papers Committee
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4058952/
https://www.ncbi.nlm.nih.gov/pubmed/24931987
http://dx.doi.org/10.1093/bioinformatics/btu263