
Effusion: prediction of protein function from sequence similarity networks

MOTIVATION: Critical evaluation of methods for protein function prediction shows that data integration improves the performance of methods that predict protein function, but a basic BLAST-based method is still a top contender. We sought to engineer a method that modernizes the classical approach whi...


Bibliographic Details
Main Authors: Yunes, Jeffrey M, Babbitt, Patricia C
Format: Online Article Text
Language: English
Published: Oxford University Press 2019
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6361244/
https://www.ncbi.nlm.nih.gov/pubmed/30084920
http://dx.doi.org/10.1093/bioinformatics/bty672
author Yunes, Jeffrey M
Babbitt, Patricia C
collection PubMed
description MOTIVATION: Critical evaluation of methods for protein function prediction shows that data integration improves the performance of methods that predict protein function, but a basic BLAST-based method is still a top contender. We sought to engineer a method that modernizes the classical approach while avoiding pitfalls common to state-of-the-art methods. RESULTS: We present a method for predicting protein function, Effusion, which uses a sequence similarity network to add context for homology transfer, a probabilistic model to account for the uncertainty in labels and function propagation, and the structure of the Gene Ontology (GO) to best utilize sparse input labels and make consistent output predictions. Effusion’s model makes it practical to integrate rare experimental data and abundant primary sequence and sequence similarity. We demonstrate Effusion’s performance using a critical evaluation method and provide an in-depth analysis. We also dissect the design decisions we used to address challenges for predicting protein function. Finally, we propose directions in which the framework of the method can be modified for additional predictive power. AVAILABILITY AND IMPLEMENTATION: The source code for an implementation of Effusion is freely available at https://github.com/babbittlab/effusion. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
format Online
Article
Text
id pubmed-6361244
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-6361244 2019-02-08 Bioinformatics Original Papers Oxford University Press 2019-02-01 2018-08-01 /pmc/articles/PMC6361244/ /pubmed/30084920 http://dx.doi.org/10.1093/bioinformatics/bty672 Text en © The Author(s) 2018. Published by Oxford University Press.
http://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
title Effusion: prediction of protein function from sequence similarity networks
topic Original Papers