Challenges in homology search: HMMER3 and convergent evolution of coiled-coil regions

Detection of protein homology via sequence similarity has important applications in biology, from protein structure and function prediction to reconstruction of phylogenies. Although current methods for aligning protein sequences are powerful, challenges remain, including problems with homologous ov...

Bibliographic Details
Main authors: Mistry, Jaina; Finn, Robert D.; Eddy, Sean R.; Bateman, Alex; Punta, Marco
Format: Online Article Text
Language: English
Published: Oxford University Press 2013
Subjects: Methods Online
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3695513/
https://www.ncbi.nlm.nih.gov/pubmed/23598997
http://dx.doi.org/10.1093/nar/gkt263
author Mistry, Jaina
Finn, Robert D.
Eddy, Sean R.
Bateman, Alex
Punta, Marco
collection PubMed
description Detection of protein homology via sequence similarity has important applications in biology, from protein structure and function prediction to reconstruction of phylogenies. Although current methods for aligning protein sequences are powerful, challenges remain, including problems with homologous overextension of alignments and with regions under convergent evolution. Here, we test the ability of the profile hidden Markov model method HMMER3 to correctly assign homologous sequences to >13 000 manually curated families from the Pfam database. We identify problem families using protein regions that match two or more Pfam families not currently annotated as related in Pfam. We find that HMMER3 E-value estimates seem to be less accurate for families that feature periodic patterns of compositional bias, such as the ones typically observed in coiled-coils. These results support the continued use of manually curated inclusion thresholds in the Pfam database, especially on the subset of families that have been identified as problematic in experiments such as these. They also highlight the need for developing new methods that can correct for this particular type of compositional bias.
format Online
Article
Text
id pubmed-3695513
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-3695513 2013-06-28
Challenges in homology search: HMMER3 and convergent evolution of coiled-coil regions
Mistry, Jaina; Finn, Robert D.; Eddy, Sean R.; Bateman, Alex; Punta, Marco
Nucleic Acids Res, Methods Online
Oxford University Press 2013-07 2013-04-17
/pmc/articles/PMC3695513/ /pubmed/23598997 http://dx.doi.org/10.1093/nar/gkt263
Text en
© The Author(s) 2013. Published by Oxford University Press. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title Challenges in homology search: HMMER3 and convergent evolution of coiled-coil regions
topic Methods Online
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3695513/
https://www.ncbi.nlm.nih.gov/pubmed/23598997
http://dx.doi.org/10.1093/nar/gkt263