
Predicting Shine–Dalgarno Sequence Locations Exposes Genome Annotation Errors

In prokaryotes, Shine–Dalgarno (SD) sequences, nucleotides upstream from start codons on messenger RNAs (mRNAs) that are complementary to ribosomal RNA (rRNA), facilitate the initiation of protein synthesis. The location of SD sequences relative to start codons and the stability of the hybridization...


Bibliographic Details
Main Authors: Starmer, J, Stomp, A, Vouk, M, Bitzer, D
Format: Text
Language: English
Published: Public Library of Science 2006
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1463019/
https://www.ncbi.nlm.nih.gov/pubmed/16710451
http://dx.doi.org/10.1371/journal.pcbi.0020057
author Starmer, J
Stomp, A
Vouk, M
Bitzer, D
collection PubMed
description In prokaryotes, Shine–Dalgarno (SD) sequences, nucleotides upstream from start codons on messenger RNAs (mRNAs) that are complementary to ribosomal RNA (rRNA), facilitate the initiation of protein synthesis. The location of SD sequences relative to start codons and the stability of the hybridization between the mRNA and the rRNA correlate with the rate of synthesis. Thus, accurate characterization of SD sequences enhances our understanding of how an organism's transcriptome relates to its cellular proteome. We implemented the Individual Nearest Neighbor Hydrogen Bond model for oligo–oligo hybridization and created a new metric, relative spacing (RS), to identify both the location and the hybridization potential of SD sequences by simulating the binding between mRNAs and single-stranded 16S rRNA 3′ tails. In 18 prokaryote genomes, we identified 2,420 genes out of 58,550 where the strongest binding in the translation initiation region included the start codon, deviating from the expected location for the SD sequence of five to ten bases upstream. We designated these as RS+1 genes. Additional analysis uncovered an unusual bias of the start codon in that the majority of the RS+1 genes used GUG, not AUG. Furthermore, of the 624 RS+1 genes whose SD sequence was associated with a free energy release of less than −8.4 kcal/mol (strong RS+1 genes), 384 were within 12 nucleotides upstream of in-frame initiation codons. The most likely explanation for the unexpected location of the SD sequence for these 384 genes is mis-annotation of the start codon. In this way, the new RS metric provides an improved method for gene sequence annotation. The remaining strong RS+1 genes appear to have their SD sequences in an unexpected location that includes the start codon. Thus, our RS metric provides a new way to explore the role of rRNA–mRNA nucleotide hybridization in translation initiation.
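The scan described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: instead of the Individual Nearest Neighbor Hydrogen Bond free-energy model, it simply finds the longest run of consecutive Watson-Crick pairs between an mRNA window and a 16S rRNA 3′ tail. The tail sequence (taken here to be the E. coli 3′ end), the function names, the `flank` parameter, and the offset convention are all assumptions made for illustration; in particular, the offset returned below is not the paper's relative spacing (RS) value.

```python
# Toy scan for the strongest rRNA-mRNA pairing near a start codon.
# Assumption: pairing strength is approximated by the longest run of
# consecutive Watson-Crick pairs, NOT by INN-HB free energies.

WC = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}

# Assumed example tail: 3' end of E. coli 16S rRNA, written 5'->3'.
RRNA_TAIL = "GAUCACCUCCUUA"


def longest_paired_run(window, tail=RRNA_TAIL):
    """Align `window` (mRNA, 5'->3') antiparallel to `tail` (5'->3').

    Returns (length, start) of the longest run of consecutive
    Watson-Crick pairs, with `start` indexed within the window.
    """
    best_len, best_start, run = 0, 0, 0
    for i, (m, r) in enumerate(zip(window, reversed(tail))):
        run = run + 1 if (m, r) in WC else 0
        if run > best_len:
            best_len, best_start = run, i - run + 1
    return best_len, best_start


def strongest_site(mrna, start, flank=20, tail=RRNA_TAIL):
    """Scan windows beginning within `flank` nt of the start codon.

    `start` is the 0-based index of the first base of the start codon.
    Returns (run_length, offset, overlaps_start), where `offset` is the
    position of the first paired mRNA base relative to `start`
    (negative = upstream). Hypothetical convention, not the paper's RS.
    """
    n = len(tail)
    best = (0, 0, False)
    lo = max(0, start - flank)
    hi = min(len(mrna) - n, start + flank)
    for pos in range(lo, hi + 1):
        length, rel = longest_paired_run(mrna[pos:pos + n], tail)
        if length > best[0]:
            first = pos + rel
            last = first + length - 1
            overlaps = first <= start + 2 and last >= start
            best = (length, first - start, overlaps)
    return best


# AGGAGG seven bases upstream of an AUG start codon at index 17.
print(strongest_site("ACACAGGAGGACAACAUAUGACUACUACUACUACU", 17))
# -> (6, -13, False)

# Paired stretch that runs through a GUG start codon at index 18.
print(strongest_site("ACACACACACACAAGGAGGUGAUCACACACACAC", 18))
# -> (12, -6, True)
```

In the second toy sequence the paired stretch covers the annotated start codon, which is the kind of situation the abstract labels RS+1. A real analysis would replace the run-length score with a free-energy calculation and apply a threshold such as the −8.4 kcal/mol cutoff mentioned in the abstract.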
format Text
id pubmed-1463019
institution National Center for Biotechnology Information
language English
publishDate 2006
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-1463019 2006-05-26 Predicting Shine–Dalgarno Sequence Locations Exposes Genome Annotation Errors Starmer, J Stomp, A Vouk, M Bitzer, D PLoS Comput Biol Research Article Public Library of Science 2006-05 2006-05-19 /pmc/articles/PMC1463019/ /pubmed/16710451 http://dx.doi.org/10.1371/journal.pcbi.0020057 Text en © 2006 Starmer et al. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title Predicting Shine–Dalgarno Sequence Locations Exposes Genome Annotation Errors
topic Research Article