HMMerge: an ensemble method for multiple sequence alignment

MOTIVATION: Despite advances in method development for multiple sequence alignment over the last several decades, the alignment of datasets exhibiting substantial sequence length heterogeneity, especially when the input sequences include very short sequences (either as a result of sequencing technol...

Bibliographic Details
Main authors: Park, Minhyuk; Warnow, Tandy
Format: Online Article Text
Language: English
Published: Oxford University Press 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10148686/
https://www.ncbi.nlm.nih.gov/pubmed/37128578
http://dx.doi.org/10.1093/bioadv/vbad052
_version_ 1785035026344181760
author Park, Minhyuk
Warnow, Tandy
collection PubMed
description MOTIVATION: Despite advances in method development for multiple sequence alignment over the last several decades, the alignment of datasets exhibiting substantial sequence length heterogeneity, especially when the input sequences include very short sequences (either as a result of sequencing technologies or of large deletions during evolution) remains an inadequately solved problem. RESULTS: We present HMMerge, a method to compute an alignment of datasets exhibiting high sequence length heterogeneity, or to add short sequences into a given ‘backbone’ alignment. HMMerge builds on the technique from its predecessor alignment methods, UPP and WITCH, which build an ensemble of profile HMMs to represent the backbone alignment and add the remaining sequences into the backbone alignment using the ensemble. HMMerge differs from UPP and WITCH by building a new ‘merged’ HMM from the ensemble, and then using that merged HMM to align the query sequences. We show that HMMerge is competitive with WITCH, with an advantage over WITCH when adding very short sequences into backbone alignments. AVAILABILITY AND IMPLEMENTATION: HMMerge is freely available at https://github.com/MinhyukPark/HMMerge. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics Advances online.
format Online
Article
Text
id pubmed-10148686
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-10148686 2023-04-30 HMMerge: an ensemble method for multiple sequence alignment Park, Minhyuk Warnow, Tandy Bioinform Adv Original Paper MOTIVATION: Despite advances in method development for multiple sequence alignment over the last several decades, the alignment of datasets exhibiting substantial sequence length heterogeneity, especially when the input sequences include very short sequences (either as a result of sequencing technologies or of large deletions during evolution) remains an inadequately solved problem. RESULTS: We present HMMerge, a method to compute an alignment of datasets exhibiting high sequence length heterogeneity, or to add short sequences into a given ‘backbone’ alignment. HMMerge builds on the technique from its predecessor alignment methods, UPP and WITCH, which build an ensemble of profile HMMs to represent the backbone alignment and add the remaining sequences into the backbone alignment using the ensemble. HMMerge differs from UPP and WITCH by building a new ‘merged’ HMM from the ensemble, and then using that merged HMM to align the query sequences. We show that HMMerge is competitive with WITCH, with an advantage over WITCH when adding very short sequences into backbone alignments. AVAILABILITY AND IMPLEMENTATION: HMMerge is freely available at https://github.com/MinhyukPark/HMMerge. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics Advances online. Oxford University Press 2023-04-17 /pmc/articles/PMC10148686/ /pubmed/37128578 http://dx.doi.org/10.1093/bioadv/vbad052 Text en © The Author(s) 2023. Published by Oxford University Press. https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title HMMerge: an ensemble method for multiple sequence alignment
topic Original Paper
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10148686/
https://www.ncbi.nlm.nih.gov/pubmed/37128578
http://dx.doi.org/10.1093/bioadv/vbad052