Progressive multiple sequence alignments from triplets
BACKGROUND: The quality of progressive sequence alignments strongly depends on the accuracy of the individual pairwise alignment steps since gaps that are introduced at one step cannot be removed at later aggregation steps. Adjacent insertions and deletions necessarily appear in arbitrary order in p...
Main Authors: | Kruspe, Matthias; Stadler, Peter F |
---|---|
Format: | Text |
Language: | English |
Published: |
BioMed Central
2007
|
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1948021/ https://www.ncbi.nlm.nih.gov/pubmed/17631683 http://dx.doi.org/10.1186/1471-2105-8-254 |
_version_ | 1782134506337075200 |
---|---|
author | Kruspe, Matthias Stadler, Peter F |
author_facet | Kruspe, Matthias Stadler, Peter F |
author_sort | Kruspe, Matthias |
collection | PubMed |
description | BACKGROUND: The quality of progressive sequence alignments strongly depends on the accuracy of the individual pairwise alignment steps since gaps that are introduced at one step cannot be removed at later aggregation steps. Adjacent insertions and deletions necessarily appear in arbitrary order in pairwise alignments and hence form an unavoidable source of errors. RESEARCH: Here we present a modified variant of progressive sequence alignments that addresses both issues. Instead of pairwise alignments we use exact dynamic programming to align sequence or profile triples. This avoids a large fraction of the ambiguities arising in pairwise alignments. In the subsequent aggregation steps we follow the logic of the Neighbor-Net algorithm, which constructs a phylogenetic network by stepwise replacing triples by pairs instead of combining pairs to singletons. To this end the three-way alignments are subdivided into two partial alignments, at which stage all-gap columns are naturally removed. This alleviates the "once a gap, always a gap" problem of progressive alignment procedures. CONCLUSION: The three-way Neighbor-Net based alignment program aln3nn is shown to compare favorably on both protein sequences and nucleic acid sequences to other progressive alignment tools. In the latter case one can easily include scoring terms that consider secondary structure features. Overall, the quality of the resulting alignments in general exceeds that of clustalw or other multiple alignment tools even though our software does not include heuristics for context-dependent (mis)match scores. |
format | Text |
id | pubmed-1948021 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2007 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-19480212007-08-14 Progressive multiple sequence alignments from triplets Kruspe, Matthias Stadler, Peter F BMC Bioinformatics Methodology Article BACKGROUND: The quality of progressive sequence alignments strongly depends on the accuracy of the individual pairwise alignment steps since gaps that are introduced at one step cannot be removed at later aggregation steps. Adjacent insertions and deletions necessarily appear in arbitrary order in pairwise alignments and hence form an unavoidable source of errors. RESEARCH: Here we present a modified variant of progressive sequence alignments that addresses both issues. Instead of pairwise alignments we use exact dynamic programming to align sequence or profile triples. This avoids a large fraction of the ambiguities arising in pairwise alignments. In the subsequent aggregation steps we follow the logic of the Neighbor-Net algorithm, which constructs a phylogenetic network by stepwise replacing triples by pairs instead of combining pairs to singletons. To this end the three-way alignments are subdivided into two partial alignments, at which stage all-gap columns are naturally removed. This alleviates the "once a gap, always a gap" problem of progressive alignment procedures. CONCLUSION: The three-way Neighbor-Net based alignment program aln3nn is shown to compare favorably on both protein sequences and nucleic acid sequences to other progressive alignment tools. In the latter case one can easily include scoring terms that consider secondary structure features. Overall, the quality of the resulting alignments in general exceeds that of clustalw or other multiple alignment tools even though our software does not include heuristics for context-dependent (mis)match scores. BioMed Central 2007-07-15 /pmc/articles/PMC1948021/ /pubmed/17631683 http://dx.doi.org/10.1186/1471-2105-8-254 Text en Copyright © 2007 Kruspe and Stadler; licensee BioMed Central Ltd. 
http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Methodology Article Kruspe, Matthias Stadler, Peter F Progressive multiple sequence alignments from triplets |
title | Progressive multiple sequence alignments from triplets |
title_full | Progressive multiple sequence alignments from triplets |
title_fullStr | Progressive multiple sequence alignments from triplets |
title_full_unstemmed | Progressive multiple sequence alignments from triplets |
title_short | Progressive multiple sequence alignments from triplets |
title_sort | progressive multiple sequence alignments from triplets |
topic | Methodology Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1948021/ https://www.ncbi.nlm.nih.gov/pubmed/17631683 http://dx.doi.org/10.1186/1471-2105-8-254 |
work_keys_str_mv | AT kruspematthias progressivemultiplesequencealignmentsfromtriplets AT stadlerpeterf progressivemultiplesequencealignmentsfromtriplets |
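
Illustrative note: the description above mentions two mechanisms, exact dynamic programming over sequence triples and the subdivision of a three-way alignment into partial alignments with all-gap columns removed. The following is a minimal Python sketch of both ideas, not the aln3nn implementation. It assumes plain sequences rather than profiles, a sum-of-pairs column score, and a linear gap penalty; the function names (`align3`, `project`) and the score values are hypothetical and chosen only for illustration.

```python
from itertools import product

GAP = "-"

def pair_score(x, y, match=1, mismatch=-1, gap=-2):
    """Score two symbols of one column (illustrative values, not from the paper)."""
    if x == GAP and y == GAP:
        return 0
    if x == GAP or y == GAP:
        return gap
    return match if x == y else mismatch

def column_score(col):
    """Sum-of-pairs score of a three-row column (an assumed scoring scheme)."""
    x, y, z = col
    return pair_score(x, y) + pair_score(x, z) + pair_score(y, z)

# The seven ways a column can consume symbols from the three sequences.
MOVES = [m for m in product((0, 1), repeat=3) if any(m)]

def align3(a, b, c):
    """Exact three-way global alignment by dynamic programming over a 3D table."""
    seqs = (a, b, c)
    score = {(0, 0, 0): 0}
    back = {}
    # Lexicographic order guarantees every predecessor cell is filled first.
    for cell in product(range(len(a) + 1), range(len(b) + 1), range(len(c) + 1)):
        if cell == (0, 0, 0):
            continue
        best = None
        for move in MOVES:
            prev = tuple(p - d for p, d in zip(cell, move))
            if min(prev) < 0:
                continue
            col = tuple(s[p] if d else GAP for s, p, d in zip(seqs, prev, move))
            cand = score[prev] + column_score(col)
            if best is None or cand > best:
                best, back[cell] = cand, move
        score[cell] = best
    # Trace back from the final cell to recover the alignment columns.
    cols = []
    end = cell = (len(a), len(b), len(c))
    while cell != (0, 0, 0):
        move = back[cell]
        prev = tuple(p - d for p, d in zip(cell, move))
        cols.append(tuple(s[p] if d else GAP for s, p, d in zip(seqs, prev, move)))
        cell = prev
    cols.reverse()
    rows = ["".join(r) for r in zip(*cols)] if cols else ["", "", ""]
    return score[end], rows

def project(rows, keep):
    """Restrict an alignment to the rows in `keep` and drop all-gap columns."""
    sub = [rows[r] for r in keep]
    cols = [col for col in zip(*sub) if any(ch != GAP for ch in col)]
    return ["".join(r) for r in zip(*cols)] if cols else ["" for _ in keep]

total, rows = align3("ACGT", "AGT", "ACT")
partial = project(rows, [0, 1])  # partial alignment of the first two rows
```

The table has one cell per index triple, so time and memory grow with the product of the three sequence lengths; this sketch is only practical for short inputs. Dropping all-gap columns in `project` is the step that lets a gap introduced in the three-way alignment disappear again from a partial alignment.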