Transcript features alone enable accurate prediction and understanding of gene expression in S. cerevisiae
BACKGROUND: Gene expression is a central process in all living organisms. Central questions in the field are related to the way the expression levels of genes are encoded in the transcripts and affect their evolution, and the potential to predict expression levels solely by transcript features. In t...
Main authors: | Zur, Hadas; Tuller, Tamir |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2013 |
Subjects: | Proceedings |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3852043/ https://www.ncbi.nlm.nih.gov/pubmed/24564391 http://dx.doi.org/10.1186/1471-2105-14-S15-S1 |
_version_ | 1782294404124377088 |
---|---|
author | Zur, Hadas Tuller, Tamir |
author_facet | Zur, Hadas Tuller, Tamir |
author_sort | Zur, Hadas |
collection | PubMed |
description | BACKGROUND: Gene expression is a central process in all living organisms. Central questions in the field are related to the way the expression levels of genes are encoded in the transcripts and affect their evolution, and the potential to predict expression levels solely by transcript features. In this study we analyze S. cerevisiae, a model organism with the most abundant relevant cellular and genomic measurements, to evaluate the accuracy in which expression levels can be predicted by different parts of the transcript. To this end, we perform various types of regression analyses based on a total of 5323 features of the transcript. The main advantage of the proposed predictors over previous ones is related to the accurate and comprehensive definitions of the relevant transcript features, which are based on biophysical knowledge of the gene transcription and translation processes, their modeling and evolution. RESULTS: Cross validation analyses of our predictors demonstrate that they achieve a correlation of 0.68/0.68/0.70/0.61/0.81 with mRNA levels, ribosomal density, protein levels, proteins per mRNA molecule (PPR), and ribosomal load (RL) respectively (all p-values < [Formula: see text]). When we consider predictors that are based exclusively on the features related to different parts of the transcript (5'UTR, ORF, 3'UTR), the correlations with protein levels were 0.27/0.71/0.25 (all p-values < [Formula: see text]), suggesting that the information in the UTRs is redundant, and features of the ORF alone yield similar predictions to the ones obtained based on the entire transcript. CONCLUSIONS: The reported results demonstrate that in the analyzed model organism the expression levels of a gene are encoded in the transcript. Specifically, the prediction of a large fraction of the variance of the different gene expression steps based on transcript features alone is feasible in S. cerevisiae. We report dozens of novel transcript features related to expression levels predictions, demonstrating how such analyses can aid in understanding the gene expression process and its evolution, and how such predictors can be designed for other organisms in the future. |
format | Online Article Text |
id | pubmed-3852043 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2013 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-38520432013-12-20 Transcript features alone enable accurate prediction and understanding of gene expression in S. cerevisiae Zur, Hadas Tuller, Tamir BMC Bioinformatics Proceedings BACKGROUND: Gene expression is a central process in all living organisms. Central questions in the field are related to the way the expression levels of genes are encoded in the transcripts and affect their evolution, and the potential to predict expression levels solely by transcript features. In this study we analyze S. cerevisiae, a model organism with the most abundant relevant cellular and genomic measurements, to evaluate the accuracy in which expression levels can be predicted by different parts of the transcript. To this end, we perform various types of regression analyses based on a total of 5323 features of the transcript. The main advantage of the proposed predictors over previous ones is related to the accurate and comprehensive definitions of the relevant transcript features, which are based on biophysical knowledge of the gene transcription and translation processes, their modeling and evolution. RESULTS: Cross validation analyses of our predictors demonstrate that they achieve a correlation of 0.68/0.68/0.70/0.61/0.81 with mRNA levels, ribosomal density, protein levels, proteins per mRNA molecule (PPR), and ribosomal load (RL) respectively (all p-values < [Formula: see text]). When we consider predictors that are based exclusively on the features related to different parts of the transcript (5'UTR, ORF, 3'UTR), the correlations with protein levels were 0.27/0.71/0.25 (all p-values < [Formula: see text]), suggesting that the information in the UTRs is redundant, and features of the ORF alone yield similar predictions to the ones obtained based on the entire transcript. CONCLUSIONS: The reported results demonstrate that in the analyzed model organism the expression levels of a gene are encoded in the transcript. Specifically, the prediction of a large fraction of the variance of the different gene expression steps based on transcript features alone is feasible in S. cerevisiae. We report dozens of novel transcript features related to expression levels predictions, demonstrating how such analyses can aid in understanding the gene expression process and its evolution, and how such predictors can be designed for other organisms in the future. BioMed Central 2013-10-15 /pmc/articles/PMC3852043/ /pubmed/24564391 http://dx.doi.org/10.1186/1471-2105-14-S15-S1 Text en Copyright © 2013 Zur and Tuller; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Proceedings Zur, Hadas Tuller, Tamir Transcript features alone enable accurate prediction and understanding of gene expression in S. cerevisiae |
title | Transcript features alone enable accurate prediction and understanding of gene expression in S. cerevisiae |
topic | Proceedings |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3852043/ https://www.ncbi.nlm.nih.gov/pubmed/24564391 http://dx.doi.org/10.1186/1471-2105-14-S15-S1 |
work_keys_str_mv | AT zurhadas transcriptfeaturesaloneenableaccuratepredictionandunderstandingofgeneexpressioninscerevisiae AT tullertamir transcriptfeaturesaloneenableaccuratepredictionandunderstandingofgeneexpressioninscerevisiae |