Cargando…

Accelerated Profile HMM Searches

Profile hidden Markov models (profile HMMs) and probabilistic inference methods have made important contributions to the theory of sequence database homology search. However, practical use of profile HMM methods has been hindered by the computational expense of existing software implementations. Her...

Descripción completa

Detalles Bibliográficos
Autor principal: Eddy, Sean R.
Formato: Online Artículo Texto
Lenguaje:English
Publicado: Public Library of Science 2011
Materias:
Acceso en línea:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3197634/
https://www.ncbi.nlm.nih.gov/pubmed/22039361
http://dx.doi.org/10.1371/journal.pcbi.1002195
_version_ 1782214344284569600
author Eddy, Sean R.
author_facet Eddy, Sean R.
author_sort Eddy, Sean R.
collection PubMed
description Profile hidden Markov models (profile HMMs) and probabilistic inference methods have made important contributions to the theory of sequence database homology search. However, practical use of profile HMM methods has been hindered by the computational expense of existing software implementations. Here I describe an acceleration heuristic for profile HMMs, the “multiple segment Viterbi” (MSV) algorithm. The MSV algorithm computes an optimal sum of multiple ungapped local alignment segments using a striped vector-parallel approach previously described for fast Smith/Waterman alignment. MSV scores follow the same statistical distribution as gapped optimal local alignment scores, allowing rapid evaluation of significance of an MSV score and thus facilitating its use as a heuristic filter. I also describe a 20-fold acceleration of the standard profile HMM Forward/Backward algorithms using a method I call “sparse rescaling”. These methods are assembled in a pipeline in which high-scoring MSV hits are passed on for reanalysis with the full HMM Forward/Backward algorithm. This accelerated pipeline is implemented in the freely available HMMER3 software package. Performance benchmarks show that the use of the heuristic MSV filter sacrifices negligible sensitivity compared to unaccelerated profile HMM searches. HMMER3 is substantially more sensitive and 100- to 1000-fold faster than HMMER2. HMMER3 is now about as fast as BLAST for protein searches.
format Online
Article
Text
id pubmed-3197634
institution National Center for Biotechnology Information
language English
publishDate 2011
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-31976342011-10-28 Accelerated Profile HMM Searches Eddy, Sean R. PLoS Comput Biol Research Article Profile hidden Markov models (profile HMMs) and probabilistic inference methods have made important contributions to the theory of sequence database homology search. However, practical use of profile HMM methods has been hindered by the computational expense of existing software implementations. Here I describe an acceleration heuristic for profile HMMs, the “multiple segment Viterbi” (MSV) algorithm. The MSV algorithm computes an optimal sum of multiple ungapped local alignment segments using a striped vector-parallel approach previously described for fast Smith/Waterman alignment. MSV scores follow the same statistical distribution as gapped optimal local alignment scores, allowing rapid evaluation of significance of an MSV score and thus facilitating its use as a heuristic filter. I also describe a 20-fold acceleration of the standard profile HMM Forward/Backward algorithms using a method I call “sparse rescaling”. These methods are assembled in a pipeline in which high-scoring MSV hits are passed on for reanalysis with the full HMM Forward/Backward algorithm. This accelerated pipeline is implemented in the freely available HMMER3 software package. Performance benchmarks show that the use of the heuristic MSV filter sacrifices negligible sensitivity compared to unaccelerated profile HMM searches. HMMER3 is substantially more sensitive and 100- to 1000-fold faster than HMMER2. HMMER3 is now about as fast as BLAST for protein searches. Public Library of Science 2011-10-20 /pmc/articles/PMC3197634/ /pubmed/22039361 http://dx.doi.org/10.1371/journal.pcbi.1002195 Text en Sean R. Eddy. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
spellingShingle Research Article
Eddy, Sean R.
Accelerated Profile HMM Searches
title Accelerated Profile HMM Searches
title_full Accelerated Profile HMM Searches
title_fullStr Accelerated Profile HMM Searches
title_full_unstemmed Accelerated Profile HMM Searches
title_short Accelerated Profile HMM Searches
title_sort accelerated profile hmm searches
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3197634/
https://www.ncbi.nlm.nih.gov/pubmed/22039361
http://dx.doi.org/10.1371/journal.pcbi.1002195
work_keys_str_mv AT eddyseanr acceleratedprofilehmmsearches