SMURFLite: combining simplified Markov random fields with simulated evolution improves remote homology detection for beta-structural proteins into the twilight zone

Motivation: One of the most successful methods to date for recognizing protein sequences that are evolutionarily related has been profile hidden Markov models (HMMs). However, these models do not capture pairwise statistical preferences of residues that are hydrogen bonded in beta sheets. These depe...

Bibliographic Details
Main Authors: Daniels, Noah M.; Hosur, Raghavendra; Berger, Bonnie; Cowen, Lenore J.
Format: Online Article Text
Language: English
Published: Oxford University Press 2012
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3338012/
https://www.ncbi.nlm.nih.gov/pubmed/22408192
http://dx.doi.org/10.1093/bioinformatics/bts110
_version_ 1782231146137911296
author Daniels, Noah M.
Hosur, Raghavendra
Berger, Bonnie
Cowen, Lenore J.
author_facet Daniels, Noah M.
Hosur, Raghavendra
Berger, Bonnie
Cowen, Lenore J.
author_sort Daniels, Noah M.
collection PubMed
description Motivation: One of the most successful methods to date for recognizing protein sequences that are evolutionarily related has been profile hidden Markov models (HMMs). However, these models do not capture pairwise statistical preferences of residues that are hydrogen bonded in beta sheets. These dependencies have been partially captured in the HMM setting by simulated evolution in the training phase and can be fully captured by Markov random fields (MRFs). However, the MRFs can be computationally prohibitive when beta strands are interleaved in complex topologies. We introduce SMURFLite, a method that combines both simplified MRFs and simulated evolution to substantially improve remote homology detection for beta structures. Unlike previous MRF-based methods, SMURFLite is computationally feasible on any beta-structural motif. Results: We test SMURFLite on all propeller and barrel folds in the mainly-beta class of the SCOP hierarchy in stringent cross-validation experiments. We show a mean 26% (median 16%) improvement in area under curve (AUC) for beta-structural motif recognition as compared with HMMER (a well-known HMM method) and a mean 33% (median 19%) improvement as compared with RAPTOR (a well-known threading method) and even a mean 18% (median 10%) improvement in AUC over HHpred (a profile–profile HMM method), despite HHpred's use of extensive additional training data. We demonstrate SMURFLite's ability to scale to whole genomes by running a SMURFLite library of 207 beta-structural SCOP superfamilies against the entire genome of Thermotoga maritima, and make over 100 new fold predictions. Availability and implementation: A webserver that runs SMURFLite is available at: http://smurf.cs.tufts.edu/smurflite/ Contact: lenore.cowen@tufts.edu; bab@mit.edu
format Online
Article
Text
id pubmed-3338012
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-3338012 2012-04-27 SMURFLite: combining simplified Markov random fields with simulated evolution improves remote homology detection for beta-structural proteins into the twilight zone Daniels, Noah M. Hosur, Raghavendra Berger, Bonnie Cowen, Lenore J. Bioinformatics Original Papers Motivation: One of the most successful methods to date for recognizing protein sequences that are evolutionarily related has been profile hidden Markov models (HMMs). However, these models do not capture pairwise statistical preferences of residues that are hydrogen bonded in beta sheets. These dependencies have been partially captured in the HMM setting by simulated evolution in the training phase and can be fully captured by Markov random fields (MRFs). However, the MRFs can be computationally prohibitive when beta strands are interleaved in complex topologies. We introduce SMURFLite, a method that combines both simplified MRFs and simulated evolution to substantially improve remote homology detection for beta structures. Unlike previous MRF-based methods, SMURFLite is computationally feasible on any beta-structural motif. Results: We test SMURFLite on all propeller and barrel folds in the mainly-beta class of the SCOP hierarchy in stringent cross-validation experiments. We show a mean 26% (median 16%) improvement in area under curve (AUC) for beta-structural motif recognition as compared with HMMER (a well-known HMM method) and a mean 33% (median 19%) improvement as compared with RAPTOR (a well-known threading method) and even a mean 18% (median 10%) improvement in AUC over HHpred (a profile–profile HMM method), despite HHpred's use of extensive additional training data. We demonstrate SMURFLite's ability to scale to whole genomes by running a SMURFLite library of 207 beta-structural SCOP superfamilies against the entire genome of Thermotoga maritima, and make over 100 new fold predictions.
Availability and implementation: A webserver that runs SMURFLite is available at: http://smurf.cs.tufts.edu/smurflite/ Contact: lenore.cowen@tufts.edu; bab@mit.edu Oxford University Press 2012-05-01 2012-03-09 /pmc/articles/PMC3338012/ /pubmed/22408192 http://dx.doi.org/10.1093/bioinformatics/bts110 Text en © The Author(s) 2012. Published by Oxford University Press. http://creativecommons.org/licenses/by-nc/3.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Original Papers
Daniels, Noah M.
Hosur, Raghavendra
Berger, Bonnie
Cowen, Lenore J.
SMURFLite: combining simplified Markov random fields with simulated evolution improves remote homology detection for beta-structural proteins into the twilight zone
title SMURFLite: combining simplified Markov random fields with simulated evolution improves remote homology detection for beta-structural proteins into the twilight zone
title_full SMURFLite: combining simplified Markov random fields with simulated evolution improves remote homology detection for beta-structural proteins into the twilight zone
title_fullStr SMURFLite: combining simplified Markov random fields with simulated evolution improves remote homology detection for beta-structural proteins into the twilight zone
title_full_unstemmed SMURFLite: combining simplified Markov random fields with simulated evolution improves remote homology detection for beta-structural proteins into the twilight zone
title_short SMURFLite: combining simplified Markov random fields with simulated evolution improves remote homology detection for beta-structural proteins into the twilight zone
title_sort smurflite: combining simplified markov random fields with simulated evolution improves remote homology detection for beta-structural proteins into the twilight zone
topic Original Papers
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3338012/
https://www.ncbi.nlm.nih.gov/pubmed/22408192
http://dx.doi.org/10.1093/bioinformatics/bts110
work_keys_str_mv AT danielsnoahm smurflitecombiningsimplifiedmarkovrandomfieldswithsimulatedevolutionimprovesremotehomologydetectionforbetastructuralproteinsintothetwilightzone
AT hosurraghavendra smurflitecombiningsimplifiedmarkovrandomfieldswithsimulatedevolutionimprovesremotehomologydetectionforbetastructuralproteinsintothetwilightzone
AT bergerbonnie smurflitecombiningsimplifiedmarkovrandomfieldswithsimulatedevolutionimprovesremotehomologydetectionforbetastructuralproteinsintothetwilightzone
AT cowenlenorej smurflitecombiningsimplifiedmarkovrandomfieldswithsimulatedevolutionimprovesremotehomologydetectionforbetastructuralproteinsintothetwilightzone