
Hidden Markov model speed heuristic and iterative HMM search procedure

BACKGROUND: Profile hidden Markov models (profile-HMMs) are sensitive tools for remote protein homology detection, but the main scoring algorithms, Viterbi or Forward, require considerable time to search large sequence databases. RESULTS: We have designed a series of database filtering steps, HMMERH...


Bibliographic Details
Main authors: Johnson, L Steven, Eddy, Sean R, Portugaly, Elon
Format: Text
Language: English
Published: BioMed Central 2010
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2931519/
https://www.ncbi.nlm.nih.gov/pubmed/20718988
http://dx.doi.org/10.1186/1471-2105-11-431
_version_ 1782186056380055552
author Johnson, L Steven
Eddy, Sean R
Portugaly, Elon
collection PubMed
description BACKGROUND: Profile hidden Markov models (profile-HMMs) are sensitive tools for remote protein homology detection, but the main scoring algorithms, Viterbi or Forward, require considerable time to search large sequence databases. RESULTS: We have designed a series of database filtering steps, HMMERHEAD, that are applied prior to the scoring algorithms, as implemented in the HMMER package, in an effort to reduce search time. Using this heuristic, we obtain a 20-fold decrease in Forward and a 6-fold decrease in Viterbi search time with a minimal loss in sensitivity relative to the unfiltered approaches. We then implemented an iterative profile-HMM search method, JackHMMER, which employs the HMMERHEAD heuristic. Due to our search heuristic, we eliminated the subdatabase creation that is common in current iterative profile-HMM approaches. On our benchmark, JackHMMER detects 14% more remote protein homologs than SAM's iterative method T2K. CONCLUSIONS: Our search heuristic, HMMERHEAD, significantly reduces the time needed to score a profile-HMM against large sequence databases. This search heuristic allowed us to implement an iterative profile-HMM search method, JackHMMER, which detects significantly more remote protein homologs than SAM's T2K and NCBI's PSI-BLAST.
format Text
id pubmed-2931519
institution National Center for Biotechnology Information
language English
publishDate 2010
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-2931519 2010-09-02 Hidden Markov model speed heuristic and iterative HMM search procedure Johnson, L Steven Eddy, Sean R Portugaly, Elon BMC Bioinformatics Research Article BioMed Central 2010-08-18 /pmc/articles/PMC2931519/ /pubmed/20718988 http://dx.doi.org/10.1186/1471-2105-11-431 Text en Copyright ©2010 Johnson et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Hidden Markov model speed heuristic and iterative HMM search procedure
topic Research Article