Multiple sequence alignments of partially coding nucleic acid sequences

BACKGROUND: High quality sequence alignments of RNA and DNA sequences are an important prerequisite for the comparative analysis of genomic sequence data. Nucleic acid sequences, however, exhibit a much larger sequence heterogeneity compared to their encoded protein sequences due to the redundancy o...

Bibliographic Details
Main authors:	Stocsits, Roman R, Hofacker, Ivo L, Fried, Claudia, Stadler, Peter F
Format:	Text
Language:	English
Published:	BioMed Central 2005
Subjects:	Software
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1182351/ https://www.ncbi.nlm.nih.gov/pubmed/15985156 http://dx.doi.org/10.1186/1471-2105-6-160

collection	PubMed
description	BACKGROUND: High quality sequence alignments of RNA and DNA sequences are an important prerequisite for the comparative analysis of genomic sequence data. Nucleic acid sequences, however, exhibit a much larger sequence heterogeneity compared to their encoded protein sequences due to the redundancy of the genetic code. It is desirable, therefore, to make use of the amino acid sequence when aligning coding nucleic acid sequences. In many cases, however, only a part of the sequence of interest is translated. On the other hand, overlapping reading frames may encode multiple alternative proteins, possibly with intermittent non-coding parts. Examples are, in particular, RNA virus genomes. RESULTS: The standard scoring scheme for nucleic acid alignments can be extended to incorporate simultaneously information on translation products in one or more reading frames. Here we present a multiple alignment tool, codaln, that implements a combined nucleic acid plus amino acid scoring model for pairwise and progressive multiple alignments that allows arbitrary weighting for almost all scoring parameters. Resource requirements of codaln are comparable with those of standard tools such as ClustalW. CONCLUSION: We demonstrate the applicability of codaln to various biologically relevant types of sequences (bacteriophage Levivirus and Vertebrate Hox clusters) and show that the combination of nucleic acid and amino acid sequence information leads to improved alignments. These, in turn, increase the performance of analysis tools that depend strictly on good input alignments such as methods for detecting conserved RNA secondary structure elements.
format	Text
id	pubmed-1182351
institution	National Center for Biotechnology Information
language	English
publishDate	2005
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-1182351 2005-08-04 Multiple sequence alignments of partially coding nucleic acid sequences Stocsits, Roman R; Hofacker, Ivo L; Fried, Claudia; Stadler, Peter F BMC Bioinformatics Software BioMed Central 2005-06-28 /pmc/articles/PMC1182351/ /pubmed/15985156 http://dx.doi.org/10.1186/1471-2105-6-160 Text en Copyright © 2005 Stocsits et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
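
The abstract in this record describes extending a nucleic acid scoring scheme with information on translation products. As a rough illustration of that general idea only, the following Python sketch scores a pair of aligned codons by adding a weighted amino acid term to a plain nucleotide match/mismatch score. It is not the codaln scoring model: the function names, the match/mismatch values, the `aa_weight` parameter, and the identity-based amino acid score (a stand-in for a real substitution matrix) are all assumptions made for this example. Only the standard genetic code table is taken as fact.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def _normalize(codon):
    """Upper-case a codon and map RNA 'U' to DNA 'T'."""
    return codon.upper().replace("U", "T")


def nucleotide_score(x, y, match=1.0, mismatch=-1.0):
    """Hypothetical match/mismatch score for two nucleotides."""
    return match if x == y else mismatch


def amino_score(p, q, match=2.0, mismatch=-1.0):
    """Hypothetical identity score; a substitution matrix would go here."""
    return match if p == q else mismatch


def combined_codon_score(codon_a, codon_b, coding=True, aa_weight=1.0):
    """Score two aligned codons.

    The nucleotide term is always counted. The amino acid term is added,
    scaled by aa_weight, only when the region is marked as coding.
    """
    a, b = _normalize(codon_a), _normalize(codon_b)
    score = sum(nucleotide_score(x, y) for x, y in zip(a, b))
    if coding:
        score += aa_weight * amino_score(CODON_TABLE[a], CODON_TABLE[b])
    return score


# CTG and TTA both encode leucine but differ at two of three positions.
print(combined_codon_score("CTG", "TTA"))                # 1.0
print(combined_codon_score("CTG", "TTA", coding=False))  # -1.0
```

With these illustrative values, two synonymous codons that look dissimilar at the nucleotide level still receive a positive score once the amino acid term is included, which is the kind of effect a combined scoring scheme is meant to capture in coding regions.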
