
Effects of using coding potential, sequence conservation and mRNA structure conservation for predicting pyrrolysine containing genes

BACKGROUND: Pyrrolysine (the 22nd amino acid) is in certain organisms and under certain circumstances encoded by the amber stop codon, UAG. The circumstances driving pyrrolysine translation are not well understood. The involvement of a predicted mRNA structure in the region downstream UAG has been s...


Bibliographic Details
Main Authors: Theil Have, Christian, Zambach, Sine, Christiansen, Henning
Format: Online Article Text
Language: English
Published: BioMed Central 2013
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3639795/
https://www.ncbi.nlm.nih.gov/pubmed/23557142
http://dx.doi.org/10.1186/1471-2105-14-118
_version_ 1782475990563291136
author Theil Have, Christian
Zambach, Sine
Christiansen, Henning
collection PubMed
description BACKGROUND: Pyrrolysine (the 22nd amino acid) is in certain organisms and under certain circumstances encoded by the amber stop codon, UAG. The circumstances driving pyrrolysine translation are not well understood. The involvement of a predicted mRNA structure in the region downstream UAG has been suggested, but the structure does not seem to be present in all pyrrolysine incorporating genes. RESULTS: We propose a strategy to predict pyrrolysine encoding genes in genomes of archaea and bacteria. We cluster open reading frames interrupted by the amber codon based on sequence similarity. We rank these clusters according to several features that may influence pyrrolysine translation. The ranking effects of different features are assessed and we propose a weighted combination of these features which best explains the currently known pyrrolysine incorporating genes. We devote special attention to the effect of structural conservation and provide further substantiation to support that structural conservation may be influential – but is not a necessary factor. Finally, from the weighted ranking, we identify a number of potentially pyrrolysine incorporating genes. CONCLUSIONS: We propose a method for prediction of pyrrolysine incorporating genes in genomes of bacteria and archaea leading to insights about the factors driving pyrrolysine translation and identification of new gene candidates. The method predicts known conserved genes with high recall and predicts several other promising candidates for experimental verification. The method is implemented as a computational pipeline which is available on request.
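The description above outlines a pipeline that collects open reading frames interrupted by the amber codon and ranks candidates by a weighted combination of features. The following is a minimal sketch of those two ideas only, not the authors' implementation: it scans the forward strand of a DNA sequence (where UAG appears as TAG), omits the sequence-similarity clustering step, and uses hypothetical function names, feature names, and weights chosen purely for illustration.

```python
# Illustrative sketch only; all names and weights are assumptions, not from the article.
STOPS = {"TAA", "TGA", "TAG"}


def amber_interrupted_orfs(seq):
    """Return (start, amber_pos, end) for forward-strand ORFs that begin at ATG,
    read through exactly one in-frame TAG, and end at the next in-frame stop."""
    seq = seq.upper()
    found = []
    for start in range(len(seq) - 2):
        if seq[start:start + 3] != "ATG":
            continue
        amber = None
        for pos in range(start + 3, len(seq) - 2, 3):
            codon = seq[pos:pos + 3]
            if codon not in STOPS:
                continue
            if codon == "TAG" and amber is None:
                amber = pos  # first in-frame TAG: treat as a possible read-through site
                continue
            if amber is not None:
                found.append((start, amber, pos + 3))
            break  # a terminating stop ends the scan for this start codon
    return found


def rank_clusters(clusters, weights):
    """Sort candidate clusters by a weighted sum of their feature scores, best first.
    Features missing from a cluster count as 0.0."""
    def score(features):
        return sum(w * features.get(name, 0.0) for name, w in weights.items())
    return sorted(clusters, key=lambda c: score(c["features"]), reverse=True)


# Hypothetical weights and feature values, for demonstration only.
weights = {
    "coding_potential": 0.5,
    "sequence_conservation": 0.3,
    "structure_conservation": 0.2,
}
clusters = [
    {"id": "a", "features": {"coding_potential": 0.2, "sequence_conservation": 0.9}},
    {"id": "b", "features": {"coding_potential": 0.8, "sequence_conservation": 0.5,
                             "structure_conservation": 0.4}},
]
ranked = rank_clusters(clusters, weights)
```

A real pipeline would also need to scan the reverse strand, group candidates by sequence similarity before ranking, and fit the weights against the known pyrrolysine incorporating genes, as the description indicates.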
format Online
Article
Text
id pubmed-3639795
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3639795 2013-05-06 Effects of using coding potential, sequence conservation and mRNA structure conservation for predicting pyrrolysine containing genes Theil Have, Christian Zambach, Sine Christiansen, Henning BMC Bioinformatics Research Article
BioMed Central 2013-04-04 /pmc/articles/PMC3639795/ /pubmed/23557142 http://dx.doi.org/10.1186/1471-2105-14-118 Text en Copyright © 2013 Theil Have et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Effects of using coding potential, sequence conservation and mRNA structure conservation for predicting pyrrolysine containing genes
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3639795/
https://www.ncbi.nlm.nih.gov/pubmed/23557142
http://dx.doi.org/10.1186/1471-2105-14-118