
Protein Molecular Function Prediction by Bayesian Phylogenomics

We present a statistical graphical model to infer specific molecular function for unannotated protein sequences using homology. Based on phylogenomic principles, SIFTER (Statistical Inference of Function Through Evolutionary Relationships) accurately predicts molecular function for members of a prot...


Bibliographic Details
Main authors: Engelhardt, Barbara E; Jordan, Michael I; Muratore, Kathryn E; Brenner, Steven E
Format: Text
Language: English
Published: Public Library of Science 2005
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1246806/
https://www.ncbi.nlm.nih.gov/pubmed/16217548
http://dx.doi.org/10.1371/journal.pcbi.0010045
author Engelhardt, Barbara E
Jordan, Michael I
Muratore, Kathryn E
Brenner, Steven E
collection PubMed
description We present a statistical graphical model to infer specific molecular function for unannotated protein sequences using homology. Based on phylogenomic principles, SIFTER (Statistical Inference of Function Through Evolutionary Relationships) accurately predicts molecular function for members of a protein family given a reconciled phylogeny and available function annotations, even when the data are sparse or noisy. Our method produced specific and consistent molecular function predictions across 100 Pfam families in comparison to the Gene Ontology annotation database, BLAST, GOtcha, and Orthostrapper. We performed a more detailed exploration of functional predictions on the adenosine-5′-monophosphate/adenosine deaminase family and the lactate/malate dehydrogenase family, in the former case comparing the predictions against a gold standard set of published functional characterizations. Given function annotations for 3% of the proteins in the deaminase family, SIFTER achieves 96% accuracy in predicting molecular function for experimentally characterized proteins as reported in the literature. The accuracy of SIFTER on this dataset is a significant improvement over other currently available methods such as BLAST (75%), GeneQuiz (64%), GOtcha (89%), and Orthostrapper (11%). We also experimentally characterized the adenosine deaminase from Plasmodium falciparum, confirming SIFTER's prediction. The results illustrate the predictive power of exploiting a statistical model of function evolution in phylogenomic problems. A software implementation of SIFTER is available from the authors.
format Text
id pubmed-1246806
institution National Center for Biotechnology Information
language English
publishDate 2005
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-1246806 2005-10-07 Protein Molecular Function Prediction by Bayesian Phylogenomics Engelhardt, Barbara E Jordan, Michael I Muratore, Kathryn E Brenner, Steven E PLoS Comput Biol Research Article Public Library of Science 2005-10 2005-10-07 /pmc/articles/PMC1246806/ /pubmed/16217548 http://dx.doi.org/10.1371/journal.pcbi.0010045 Text en Copyright: © 2005 Engelhardt et al.
http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
topic Research Article