Domain prediction with probabilistic directional context

MOTIVATION: Protein domain prediction is one of the most powerful approaches for sequence-based function prediction. Although domain instances are typically predicted independently of each other, newer approaches have demonstrated improved performance by rewarding domain pairs that frequently co-occ...

Bibliographic Details
Main Authors: Ochoa, Alejandro; Singh, Mona
Format: Online Article Text
Language: English
Published: Oxford University Press 2017
Subjects: Original Papers
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5870623/
https://www.ncbi.nlm.nih.gov/pubmed/28407137
http://dx.doi.org/10.1093/bioinformatics/btx221
author Ochoa, Alejandro
Singh, Mona
collection PubMed
description MOTIVATION: Protein domain prediction is one of the most powerful approaches for sequence-based function prediction. Although domain instances are typically predicted independently of each other, newer approaches have demonstrated improved performance by rewarding domain pairs that frequently co-occur within sequences. However, most of these approaches have ignored the order in which domains preferentially co-occur and have also not modeled domain co-occurrence probabilistically. RESULTS: We introduce a probabilistic approach for domain prediction that models ‘directional’ domain context. Our method is the first to score all domain pairs within a sequence while taking their order into account, even for non-sequential domains. We show that our approach extends a previous Markov model-based approach to additionally score all pairwise terms, and that it can be interpreted within the context of Markov random fields. We formulate our underlying combinatorial optimization problem as an integer linear program, and demonstrate that it can be solved quickly in practice. Finally, we perform extensive evaluation of domain context methods and demonstrate that incorporating context increases the number of domain predictions by ∼15%, with our approach dPUC2 (Domain Prediction Using Context) outperforming all competing approaches. AVAILABILITY AND IMPLEMENTATION: dPUC2 is available at http://github.com/alexviiia/dpuc2. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
format Online
Article
Text
id pubmed-5870623
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-5870623 2018-04-05 Domain prediction with probabilistic directional context Ochoa, Alejandro Singh, Mona Bioinformatics Original Papers Oxford University Press 2017-08-15 2017-04-12 /pmc/articles/PMC5870623/ /pubmed/28407137 http://dx.doi.org/10.1093/bioinformatics/btx221 Text en © The Author 2017. Published by Oxford University Press. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title Domain prediction with probabilistic directional context
topic Original Papers
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5870623/
https://www.ncbi.nlm.nih.gov/pubmed/28407137
http://dx.doi.org/10.1093/bioinformatics/btx221