Instability in progressive multiple sequence alignment algorithms
BACKGROUND: Progressive alignment is the standard approach used to align large numbers of sequences. As with all heuristics, this involves a tradeoff between alignment accuracy and computation time. RESULTS: We examine this tradeoff and find that, because of a loss of information in the early steps...
Main Authors: | Boyce, Kieran; Sievers, Fabian; Higgins, Desmond G. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2015 |
Subjects: | Research |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4599319/ https://www.ncbi.nlm.nih.gov/pubmed/26457114 http://dx.doi.org/10.1186/s13015-015-0057-1 |
_version_ | 1782394231494541312 |
---|---|
author | Boyce, Kieran Sievers, Fabian Higgins, Desmond G. |
author_facet | Boyce, Kieran Sievers, Fabian Higgins, Desmond G. |
author_sort | Boyce, Kieran |
collection | PubMed |
description | BACKGROUND: Progressive alignment is the standard approach used to align large numbers of sequences. As with all heuristics, this involves a tradeoff between alignment accuracy and computation time. RESULTS: We examine this tradeoff and find that, because of a loss of information in the early steps of the approach, the alignments generated by the most common multiple sequence alignment programs are inherently unstable, and simply reversing the order of the sequences in the input file will cause a different alignment to be generated. Although this effect is more obvious with larger numbers of sequences, it can also be seen with data sets in the order of one hundred sequences. We also outline the means to determine the number of sequences in a data set beyond which the probability of instability will become more pronounced. CONCLUSIONS: This has major ramifications for both the designers of large-scale multiple sequence alignment algorithms, and for the users of these alignments. |
format | Online Article Text |
id | pubmed-4599319 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2015 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-4599319 2015-10-10 Instability in progressive multiple sequence alignment algorithms Boyce, Kieran Sievers, Fabian Higgins, Desmond G. Algorithms Mol Biol Research BACKGROUND: Progressive alignment is the standard approach used to align large numbers of sequences. As with all heuristics, this involves a tradeoff between alignment accuracy and computation time. RESULTS: We examine this tradeoff and find that, because of a loss of information in the early steps of the approach, the alignments generated by the most common multiple sequence alignment programs are inherently unstable, and simply reversing the order of the sequences in the input file will cause a different alignment to be generated. Although this effect is more obvious with larger numbers of sequences, it can also be seen with data sets in the order of one hundred sequences. We also outline the means to determine the number of sequences in a data set beyond which the probability of instability will become more pronounced. CONCLUSIONS: This has major ramifications for both the designers of large-scale multiple sequence alignment algorithms, and for the users of these alignments. BioMed Central 2015-10-09 /pmc/articles/PMC4599319/ /pubmed/26457114 http://dx.doi.org/10.1186/s13015-015-0057-1 Text en © Boyce et al. 2015 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
title | Instability in progressive multiple sequence alignment algorithms |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4599319/ https://www.ncbi.nlm.nih.gov/pubmed/26457114 http://dx.doi.org/10.1186/s13015-015-0057-1 |