Improving the prediction accuracy of protein abundance in Escherichia coli using mRNA accessibility

Bibliographic Details
Main Authors: Terai, Goro, Asai, Kiyoshi
Format: Online Article Text
Language: English
Published: Oxford University Press 2020
Subjects: Methods Online
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7641306/
https://www.ncbi.nlm.nih.gov/pubmed/32504488
http://dx.doi.org/10.1093/nar/gkaa481
_version_ 1783605889014431744
author Terai, Goro
Asai, Kiyoshi
collection PubMed
description RNA secondary structure around translation initiation sites strongly affects the abundance of expressed proteins in Escherichia coli. However, detailed secondary structural features governing protein abundance remain elusive. Recent advances in high-throughput DNA synthesis and experimental systems enable us to obtain large amounts of data. Here, we evaluated six types of structural features using two large-scale datasets. We found that accessibility, which is the probability that a given region around the start codon has no base-paired nucleotides, showed the highest correlation with protein abundance in both datasets. Accessibility showed a significantly higher correlation (Spearman’s ρ = 0.709) than the widely used minimum free energy (0.554) in one of the datasets. Interestingly, accessibility showed the highest correlation only when it was calculated by a log-linear model, indicating that the RNA structural model and how to utilize it are important. Furthermore, by combining the accessibility and activity of the Shine-Dalgarno sequence, we devised a method for predicting protein abundance more accurately than existing methods. We inferred that the log-linear model has a broader probabilistic distribution than the widely used Turner energy model, which contributed to more accurate quantification of ribosome accessibility to translation initiation sites.
format Online
Article
Text
id pubmed-7641306
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-7641306 2020-11-10 Improving the prediction accuracy of protein abundance in Escherichia coli using mRNA accessibility. Terai, Goro; Asai, Kiyoshi. Nucleic Acids Res, Methods Online. Oxford University Press, 2020-06-06. /pmc/articles/PMC7641306/ /pubmed/32504488 http://dx.doi.org/10.1093/nar/gkaa481 Text en © The Author(s) 2020. Published by Oxford University Press on behalf of Nucleic Acids Research.
http://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
title Improving the prediction accuracy of protein abundance in Escherichia coli using mRNA accessibility
topic Methods Online
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7641306/
https://www.ncbi.nlm.nih.gov/pubmed/32504488
http://dx.doi.org/10.1093/nar/gkaa481
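
As a rough illustration of the two quantities named in the description field above, the following Python sketch computes accessibility over a toy structure ensemble and Spearman's ρ between a feature and protein abundance. Everything in it is an assumption made for illustration: the dot-bracket structures, their probabilities, and the function names accessibility, average_ranks, and spearman_rho are hypothetical and are not taken from the article, which derives accessibility from an RNA structural model (log-linear or Turner energy) rather than from a hand-written ensemble.

```python
from typing import Sequence


def accessibility(structures: Sequence[str], probs: Sequence[float],
                  start: int, end: int) -> float:
    """Probability that positions start..end-1 contain no base-paired nucleotide.

    structures: dot-bracket strings of equal length ('.' means unpaired).
    probs: non-negative weights for each structure (normalised here).
    """
    total = sum(probs)
    open_mass = sum(p for s, p in zip(structures, probs)
                    if set(s[start:end]) == {"."})
    return open_mass / total


def average_ranks(values: Sequence[float]) -> list:
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman's rho: Pearson correlation of the rank-transformed values."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


if __name__ == "__main__":
    # Hypothetical three-structure ensemble for a 10-nt window.
    ensemble = ["((....))..", "..........", "..((..)).."]
    weights = [0.5, 0.25, 0.25]
    print(accessibility(ensemble, weights, 2, 6))

    # Hypothetical feature values and abundances, for illustration only.
    print(spearman_rho([0.1, 0.4, 0.3, 0.9], [12.0, 30.0, 45.0, 80.0]))
```

In this sketch, a region counts as accessible only in structures where every position in the window is unpaired, which mirrors the definition given in the description; a real analysis would obtain these probabilities from a partition-function calculation instead of enumerating structures.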