
Sequence similarity is more relevant than species specificity in probabilistic backtranslation

BACKGROUND: Backtranslation is the process of decoding a sequence of amino acids into the corresponding codons. All synthetic gene design systems include a backtranslation module. The degeneracy of the genetic code makes backtranslation potentially ambiguous since most amino acids are encoded by mul...


Bibliographic Details
Main authors:	Ferro, Alfredo, Giugno, Rosalba, Pigola, Giuseppe, Pulvirenti, Alfredo, Di Pietro, Cinzia, Purrello, Michele, Ragusa, Marco
Format:	Text
Language:	English
Published:	BioMed Central 2007
Subjects:	Software
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1810562/ https://www.ncbi.nlm.nih.gov/pubmed/17313665 http://dx.doi.org/10.1186/1471-2105-8-58

_version_	1782132603207286784
author	Ferro, Alfredo Giugno, Rosalba Pigola, Giuseppe Pulvirenti, Alfredo Di Pietro, Cinzia Purrello, Michele Ragusa, Marco
author_facet	Ferro, Alfredo Giugno, Rosalba Pigola, Giuseppe Pulvirenti, Alfredo Di Pietro, Cinzia Purrello, Michele Ragusa, Marco
author_sort	Ferro, Alfredo
collection	PubMed
description	BACKGROUND: Backtranslation is the process of decoding a sequence of amino acids into the corresponding codons. All synthetic gene design systems include a backtranslation module. The degeneracy of the genetic code makes backtranslation potentially ambiguous since most amino acids are encoded by multiple codons. The common approach to overcome this difficulty is based on imitation of codon usage within the target species. RESULTS: This paper describes EasyBack, a new parameter-free, fully-automated software for backtranslation using Hidden Markov Models. EasyBack is not based on imitation of codon usage within the target species, but instead uses a sequence-similarity criterion. The model is trained with a set of proteins with known cDNA coding sequences, constructed from the input protein by querying the NCBI databases with BLAST. Unlike existing software, the proposed method allows the quality of prediction to be estimated. When tested on a group of proteins that show different degrees of sequence conservation, EasyBack outperforms other published methods in terms of precision. CONCLUSION: The prediction quality of a protein backtranslation method is markedly increased by replacing the criterion of most used codon in the same species with a Hidden Markov Model trained with a set of most similar sequences from all species. Moreover, the proposed method allows the quality of prediction to be estimated probabilistically.
format	Text
id	pubmed-1810562
institution	National Center for Biotechnology Information
language	English
publishDate	2007
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-1810562 2007-03-13 Sequence similarity is more relevant than species specificity in probabilistic backtranslation Ferro, Alfredo Giugno, Rosalba Pigola, Giuseppe Pulvirenti, Alfredo Di Pietro, Cinzia Purrello, Michele Ragusa, Marco BMC Bioinformatics Software BACKGROUND: Backtranslation is the process of decoding a sequence of amino acids into the corresponding codons. All synthetic gene design systems include a backtranslation module. The degeneracy of the genetic code makes backtranslation potentially ambiguous since most amino acids are encoded by multiple codons. The common approach to overcome this difficulty is based on imitation of codon usage within the target species. RESULTS: This paper describes EasyBack, a new parameter-free, fully-automated software for backtranslation using Hidden Markov Models. EasyBack is not based on imitation of codon usage within the target species, but instead uses a sequence-similarity criterion. The model is trained with a set of proteins with known cDNA coding sequences, constructed from the input protein by querying the NCBI databases with BLAST. Unlike existing software, the proposed method allows the quality of prediction to be estimated. When tested on a group of proteins that show different degrees of sequence conservation, EasyBack outperforms other published methods in terms of precision. CONCLUSION: The prediction quality of a protein backtranslation method is markedly increased by replacing the criterion of most used codon in the same species with a Hidden Markov Model trained with a set of most similar sequences from all species. Moreover, the proposed method allows the quality of prediction to be estimated probabilistically. BioMed Central 2007-02-21 /pmc/articles/PMC1810562/ /pubmed/17313665 http://dx.doi.org/10.1186/1471-2105-8-58 Text en Copyright © 2007 Ferro et al; licensee BioMed Central Ltd.
http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle	Software Ferro, Alfredo Giugno, Rosalba Pigola, Giuseppe Pulvirenti, Alfredo Di Pietro, Cinzia Purrello, Michele Ragusa, Marco Sequence similarity is more relevant than species specificity in probabilistic backtranslation
title	Sequence similarity is more relevant than species specificity in probabilistic backtranslation
title_full	Sequence similarity is more relevant than species specificity in probabilistic backtranslation
title_fullStr	Sequence similarity is more relevant than species specificity in probabilistic backtranslation
title_full_unstemmed	Sequence similarity is more relevant than species specificity in probabilistic backtranslation
title_short	Sequence similarity is more relevant than species specificity in probabilistic backtranslation
title_sort	sequence similarity is more relevant than species specificity in probabilistic backtranslation
topic	Software
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1810562/ https://www.ncbi.nlm.nih.gov/pubmed/17313665 http://dx.doi.org/10.1186/1471-2105-8-58
work_keys_str_mv	AT ferroalfredo sequencesimilarityismorerelevantthanspeciesspecificityinprobabilisticbacktranslation AT giugnorosalba sequencesimilarityismorerelevantthanspeciesspecificityinprobabilisticbacktranslation AT pigolagiuseppe sequencesimilarityismorerelevantthanspeciesspecificityinprobabilisticbacktranslation AT pulvirentialfredo sequencesimilarityismorerelevantthanspeciesspecificityinprobabilisticbacktranslation AT dipietrocinzia sequencesimilarityismorerelevantthanspeciesspecificityinprobabilisticbacktranslation AT purrellomichele sequencesimilarityismorerelevantthanspeciesspecificityinprobabilisticbacktranslation AT ragusamarco sequencesimilarityismorerelevantthanspeciesspecificityinprobabilisticbacktranslation
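
Illustrative note: the abstract in this record says that backtranslation is ambiguous because most amino acids are encoded by several codons, and that a common workaround is to imitate observed codon usage. The Python sketch below is only meant to make those two statements concrete. It is not EasyBack and does not use a Hidden Markov Model: it counts the candidate coding sequences for a peptide under the standard genetic code, then picks the most frequent codon per amino acid from a small set of coding sequences. The function names and the toy training sequences are assumptions made up for this example, not data from the article.

```python
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code, codons enumerated with bases in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TO_AA = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def synonymous_codons():
    """Map each amino acid (and '*' for stop) to the codons that encode it."""
    table = defaultdict(list)
    for codon, aa in CODON_TO_AA.items():
        table[aa].append(codon)
    return dict(table)


def count_backtranslations(protein):
    """Number of distinct coding sequences that translate to `protein`."""
    syn = synonymous_codons()
    total = 1
    for aa in protein:
        total *= len(syn[aa])
    return total


def codon_usage(coding_sequences):
    """Count codon occurrences per amino acid in in-frame coding sequences."""
    usage = defaultdict(Counter)
    for cds in coding_sequences:
        for i in range(0, len(cds) - len(cds) % 3, 3):
            codon = cds[i:i + 3]
            usage[CODON_TO_AA[codon]][codon] += 1
    return usage


def backtranslate(protein, usage):
    """Pick the most frequent observed codon for each residue.

    Falls back to the first synonymous codon when an amino acid was
    never observed in the training sequences (an arbitrary choice).
    """
    syn = synonymous_codons()
    codons = []
    for aa in protein:
        observed = usage.get(aa)
        codons.append(observed.most_common(1)[0][0] if observed else syn[aa][0])
    return "".join(codons)


# Hypothetical toy training set; a real one would come from known coding sequences.
training = ["ATGAAAGCTCTG", "ATGAAGGCTCTG", "ATGAAAGCCCTG"]
usage = codon_usage(training)

print(count_backtranslations("MKAL"))  # 48 candidate sequences (1 * 2 * 4 * 6)
print(backtranslate("MKAL", usage))    # ATGAAAGCTCTG
```

The frequency table here can be built from any set of coding sequences, whether drawn from one species or from sequences similar to the query; the method described in the abstract replaces this per-residue majority vote with a trained probabilistic model.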
