Instability in progressive multiple sequence alignment algorithms

BACKGROUND: Progressive alignment is the standard approach used to align large numbers of sequences. As with all heuristics, this involves a tradeoff between alignment accuracy and computation time. RESULTS: We examine this tradeoff and find that, because of a loss of information in the early steps...

Bibliographic Details
Main authors: Boyce, Kieran; Sievers, Fabian; Higgins, Desmond G.
Format: Online Article Text
Language: English
Published: BioMed Central 2015
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4599319/
https://www.ncbi.nlm.nih.gov/pubmed/26457114
http://dx.doi.org/10.1186/s13015-015-0057-1
author Boyce, Kieran
Sievers, Fabian
Higgins, Desmond G.
collection PubMed
description BACKGROUND: Progressive alignment is the standard approach used to align large numbers of sequences. As with all heuristics, this involves a tradeoff between alignment accuracy and computation time. RESULTS: We examine this tradeoff and find that, because of a loss of information in the early steps of the approach, the alignments generated by the most common multiple sequence alignment programs are inherently unstable, and simply reversing the order of the sequences in the input file will cause a different alignment to be generated. Although this effect is more obvious with larger numbers of sequences, it can also be seen with data sets in the order of one hundred sequences. We also outline the means to determine the number of sequences in a data set beyond which the probability of instability will become more pronounced. CONCLUSIONS: This has major ramifications for both the designers of large-scale multiple sequence alignment algorithms, and for the users of these alignments.
format Online
Article
Text
id pubmed-4599319
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4599319 2015-10-10 Instability in progressive multiple sequence alignment algorithms Boyce, Kieran Sievers, Fabian Higgins, Desmond G. Algorithms Mol Biol Research BACKGROUND: Progressive alignment is the standard approach used to align large numbers of sequences. As with all heuristics, this involves a tradeoff between alignment accuracy and computation time. RESULTS: We examine this tradeoff and find that, because of a loss of information in the early steps of the approach, the alignments generated by the most common multiple sequence alignment programs are inherently unstable, and simply reversing the order of the sequences in the input file will cause a different alignment to be generated. Although this effect is more obvious with larger numbers of sequences, it can also be seen with data sets in the order of one hundred sequences. We also outline the means to determine the number of sequences in a data set beyond which the probability of instability will become more pronounced. CONCLUSIONS: This has major ramifications for both the designers of large-scale multiple sequence alignment algorithms, and for the users of these alignments. BioMed Central 2015-10-09 /pmc/articles/PMC4599319/ /pubmed/26457114 http://dx.doi.org/10.1186/s13015-015-0057-1 Text en © Boyce et al. 2015 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Instability in progressive multiple sequence alignment algorithms
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4599319/
https://www.ncbi.nlm.nih.gov/pubmed/26457114
http://dx.doi.org/10.1186/s13015-015-0057-1