
Ribosome signatures aid bacterial translation initiation site identification


Bibliographic Details
Main authors: Giess, Adam; Jonckheere, Veronique; Ndah, Elvis; Chyżyńska, Katarzyna; Van Damme, Petra; Valen, Eivind
Format: Online Article Text
Language: English
Published: BioMed Central 2017
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5576327/
https://www.ncbi.nlm.nih.gov/pubmed/28854918
http://dx.doi.org/10.1186/s12915-017-0416-0
Description
Summary:
BACKGROUND: While methods for annotation of genes are increasingly reliable, the exact identification of translation initiation sites remains a challenging problem. Since the N-termini of proteins often contain regulatory and targeting information, developing a robust method for start site identification is crucial. Ribosome profiling reads show distinct patterns of read length distributions around translation initiation sites. These patterns are typically lost in standard ribosome profiling analysis pipelines, when reads from footprints are adjusted to determine the specific codon being translated.
RESULTS: Utilising these signatures in combination with nucleotide sequence information, we build a model capable of predicting translation initiation sites and demonstrate its high accuracy using N-terminal proteomics. Applying this to prokaryotic translatomes, we re-annotate translation initiation sites and provide evidence of N-terminal truncations and extensions of previously annotated coding sequences. These re-annotations are supported by the presence of structural and sequence-based features in addition to N-terminal peptide evidence. Finally, our model identifies 61 novel genes previously undiscovered in the Salmonella enterica genome.
CONCLUSIONS: Signatures within ribosome profiling read length distributions can be used in combination with nucleotide sequence information to provide accurate genome-wide identification of translation initiation sites.
ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12915-017-0416-0) contains supplementary material, which is available to authorized users.