
Bayesian Markov Random Field Analysis for Protein Function Prediction Based on Network Data

Inference of protein functions is one of the most important aims of modern biology. To fully exploit the large volumes of genomic data typically produced in modern-day genomic experiments, automated computational methods for protein function prediction are urgently needed. Established methods use se...


Bibliographic Details
Main Authors: Kourmpetis, Yiannis A. I., van Dijk, Aalt D. J., Bink, Marco C. A. M., van Ham, Roeland C. H. J., ter Braak, Cajo J. F.
Format: Text
Language: English
Published: Public Library of Science 2010
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2827541/
https://www.ncbi.nlm.nih.gov/pubmed/20195360
http://dx.doi.org/10.1371/journal.pone.0009293
author Kourmpetis, Yiannis A. I.
van Dijk, Aalt D. J.
Bink, Marco C. A. M.
van Ham, Roeland C. H. J.
ter Braak, Cajo J. F.
collection PubMed
description Inference of protein functions is one of the most important aims of modern biology. To fully exploit the large volumes of genomic data typically produced in modern-day genomic experiments, automated computational methods for protein function prediction are urgently needed. Established methods use sequence or structure similarity to infer functions but those types of data do not suffice to determine the biological context in which proteins act. Current high-throughput biological experiments produce large amounts of data on the interactions between proteins. Such data can be used to infer interaction networks and to predict the biological process that the protein is involved in. Here, we develop a probabilistic approach for protein function prediction using network data, such as protein-protein interaction measurements. We take a Bayesian approach to an existing Markov Random Field method by performing simultaneous estimation of the model parameters and prediction of protein functions. We use an adaptive Markov Chain Monte Carlo algorithm that leads to more accurate parameter estimates and consequently to improved prediction performance compared to the standard Markov Random Fields method. We tested our method using a high-quality S. cerevisiae validation network with 1622 proteins against 90 Gene Ontology terms of different levels of abstraction. Compared to three other protein function prediction methods, our approach shows very good prediction performance. Our method can be directly applied to protein-protein interaction or coexpression networks, but can also be extended to use multiple data sources. We apply our method to physical protein interaction data from S. cerevisiae and provide novel predictions, using 340 Gene Ontology terms, for 1170 unannotated proteins and we evaluate the predictions using the available literature.
format Text
id pubmed-2827541
institution National Center for Biotechnology Information
language English
publishDate 2010
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-2827541 2010-03-02 Bayesian Markov Random Field Analysis for Protein Function Prediction Based on Network Data Kourmpetis, Yiannis A. I. van Dijk, Aalt D. J. Bink, Marco C. A. M. van Ham, Roeland C. H. J. ter Braak, Cajo J. F. PLoS One Research Article Public Library of Science 2010-02-24 /pmc/articles/PMC2827541/ /pubmed/20195360 http://dx.doi.org/10.1371/journal.pone.0009293 Text en Kourmpetis et al. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title Bayesian Markov Random Field Analysis for Protein Function Prediction Based on Network Data
topic Research Article