Cargando…

Improving the prediction accuracy of protein abundance in Escherichia coli using mRNA accessibility

RNA secondary structure around translation initiation sites strongly affects the abundance of expressed proteins in Escherichia coli. However, detailed secondary structural features governing protein abundance remain elusive. Recent advances in high-throughput DNA synthesis and experimental systems...

Descripción completa

Detalles Bibliográficos
Autores principales: Terai, Goro, Asai, Kiyoshi
Formato: Online Artículo Texto
Lenguaje:English
Publicado: Oxford University Press 2020
Materias:
Acceso en línea:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7641306/
https://www.ncbi.nlm.nih.gov/pubmed/32504488
http://dx.doi.org/10.1093/nar/gkaa481
Descripción
Sumario:RNA secondary structure around translation initiation sites strongly affects the abundance of expressed proteins in Escherichia coli. However, detailed secondary structural features governing protein abundance remain elusive. Recent advances in high-throughput DNA synthesis and experimental systems enable us to obtain large amounts of data. Here, we evaluated six types of structural features using two large-scale datasets. We found that accessibility, which is the probability that a given region around the start codon has no base-paired nucleotides, showed the highest correlation with protein abundance in both datasets. Accessibility showed a significantly higher correlation (Spearman’s ρ = 0.709) than the widely used minimum free energy (0.554) in one of the datasets. Interestingly, accessibility showed the highest correlation only when it was calculated by a log-linear model, indicating that the RNA structural model and how to utilize it are important. Furthermore, by combining the accessibility and activity of the Shine-Dalgarno sequence, we devised a method for predicting protein abundance more accurately than existing methods. We inferred that the log-linear model has a broader probabilistic distribution than the widely used Turner energy model, which contributed to more accurate quantification of ribosome accessibility to translation initiation sites.