
Modular and configurable optimal sequence alignment software: Cola

BACKGROUND: The fundamental challenge in optimally aligning homologous sequences is to define a scoring scheme that best reflects the underlying biological processes. Maximising the overall number of matches in the alignment does not always reflect the patterns by which nucleotides mutate. Efficient...


Bibliographic Details
Main authors: Zamani, Neda; Sundström, Görel; Höppner, Marc P; Grabherr, Manfred G
Format: Online Article Text
Language: English
Published: BioMed Central 2014
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4064277/
https://www.ncbi.nlm.nih.gov/pubmed/24976859
http://dx.doi.org/10.1186/1751-0473-9-12
_version_ 1782321932630228992
author Zamani, Neda
Sundström, Görel
Höppner, Marc P
Grabherr, Manfred G
collection PubMed
description BACKGROUND: The fundamental challenge in optimally aligning homologous sequences is to define a scoring scheme that best reflects the underlying biological processes. Maximising the overall number of matches in the alignment does not always reflect the patterns by which nucleotides mutate. Efficiently implemented algorithms that can be parameterised to accommodate more complex non-linear scoring schemes are thus desirable. RESULTS: We present Cola, alignment software that implements different optimal alignment algorithms, also allowing for scoring contiguous matches of nucleotides in a nonlinear manner. The latter places more emphasis on short, highly conserved motifs, and less on the surrounding nucleotides, which can be more diverged. To illustrate the differences, we report results from aligning 14,100 sequences from 3' untranslated regions of human genes to 25 of their mammalian counterparts, where we found that a nonlinear scoring scheme is more consistent than a linear scheme in detecting short, conserved motifs. CONCLUSIONS: Cola is freely available under LGPL from https://github.com/nedaz/cola.
format Online
Article
Text
id pubmed-4064277
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4064277 2014-06-27 Modular and configurable optimal sequence alignment software: Cola Zamani, Neda Sundström, Görel Höppner, Marc P Grabherr, Manfred G Source Code Biol Med Brief Reports BACKGROUND: The fundamental challenge in optimally aligning homologous sequences is to define a scoring scheme that best reflects the underlying biological processes. Maximising the overall number of matches in the alignment does not always reflect the patterns by which nucleotides mutate. Efficiently implemented algorithms that can be parameterised to accommodate more complex non-linear scoring schemes are thus desirable. RESULTS: We present Cola, alignment software that implements different optimal alignment algorithms, also allowing for scoring contiguous matches of nucleotides in a nonlinear manner. The latter places more emphasis on short, highly conserved motifs, and less on the surrounding nucleotides, which can be more diverged. To illustrate the differences, we report results from aligning 14,100 sequences from 3' untranslated regions of human genes to 25 of their mammalian counterparts, where we found that a nonlinear scoring scheme is more consistent than a linear scheme in detecting short, conserved motifs. CONCLUSIONS: Cola is freely available under LGPL from https://github.com/nedaz/cola. BioMed Central 2014-06-09 /pmc/articles/PMC4064277/ /pubmed/24976859 http://dx.doi.org/10.1186/1751-0473-9-12 Text en Copyright © 2014 Zamani et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/4.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited.
The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Modular and configurable optimal sequence alignment software: Cola
topic Brief Reports
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4064277/
https://www.ncbi.nlm.nih.gov/pubmed/24976859
http://dx.doi.org/10.1186/1751-0473-9-12