Instability in progressive multiple sequence alignment algorithms


Bibliographic Details
Main authors: Boyce, Kieran; Sievers, Fabian; Higgins, Desmond G.
Format: Online Article Text
Language: English
Published: BioMed Central 2015
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4599319/
https://www.ncbi.nlm.nih.gov/pubmed/26457114
http://dx.doi.org/10.1186/s13015-015-0057-1
Description
Summary: BACKGROUND: Progressive alignment is the standard approach used to align large numbers of sequences. As with all heuristics, this involves a tradeoff between alignment accuracy and computation time. RESULTS: We examine this tradeoff and find that, because of a loss of information in the early steps of the approach, the alignments generated by the most common multiple sequence alignment programs are inherently unstable, and simply reversing the order of the sequences in the input file will cause a different alignment to be generated. Although this effect is more obvious with larger numbers of sequences, it can also be seen with data sets on the order of one hundred sequences. We also outline the means to determine the number of sequences in a data set beyond which the probability of instability will become more pronounced. CONCLUSIONS: This has major ramifications for both the designers of large-scale multiple sequence alignment algorithms and the users of these alignments.
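
The order-reversal check described in the summary can be sketched in a few lines of Python. This is a minimal illustration, not the authors' procedure: the helper names (`parse_fasta`, `reverse_order`, `same_alignment`) are hypothetical, the alignment program itself is not invoked, and both alignments are assumed to be available as aligned FASTA text.

```python
from typing import List, Tuple


def parse_fasta(text: str) -> List[Tuple[str, str]]:
    """Parse FASTA text into (header, sequence) pairs, keeping file order."""
    records: List[Tuple[str, str]] = []
    header = None
    chunks: List[str] = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header = line[1:].strip()
            chunks = []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def to_fasta(records: List[Tuple[str, str]]) -> str:
    """Serialize (header, sequence) pairs back to FASTA text."""
    return "".join(f">{name}\n{seq}\n" for name, seq in records)


def reverse_order(text: str) -> str:
    """Return the same FASTA records with the input order reversed."""
    return to_fasta(list(reversed(parse_fasta(text))))


def same_alignment(aligned_a: str, aligned_b: str) -> bool:
    """Compare two aligned FASTA texts while ignoring row order.

    Assumption: two alignments count as identical when every sequence
    has exactly the same gapped row in both outputs.
    """
    return dict(parse_fasta(aligned_a)) == dict(parse_fasta(aligned_b))
```

To use it, write `reverse_order(original_text)` to a second input file, run the same alignment program with the same settings on both files, and pass the two outputs to `same_alignment`. A `False` result would correspond to the kind of instability the summary reports.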