Novel pipeline identifies new upstream ORFs and non-AUG initiating main ORFs with conserved amino acid sequences in the 5′ leader of mRNAs in Arabidopsis thaliana

Eukaryotic mRNAs contain a 5′ leader sequence preceding the main open reading frame (mORF) and, depending on the species, 20%–50% of eukaryotic mRNAs harbor an upstream ORF (uORF) in the 5′ leader. An unknown fraction of these uORFs encode sequence conserved peptides (conserved peptide uORFs, CPuORF...

Bibliographic Details
Main authors: van der Horst, Sjors; Snel, Berend; Hanson, Johannes; Smeekens, Sjef
Format: Online Article Text
Language: English
Published: Cold Spring Harbor Laboratory Press 2019
Subjects: Bioinformatics
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6380273/
https://www.ncbi.nlm.nih.gov/pubmed/30567971
http://dx.doi.org/10.1261/rna.067983.118
author van der Horst, Sjors
Snel, Berend
Hanson, Johannes
Smeekens, Sjef
collection PubMed
description Eukaryotic mRNAs contain a 5′ leader sequence preceding the main open reading frame (mORF) and, depending on the species, 20%–50% of eukaryotic mRNAs harbor an upstream ORF (uORF) in the 5′ leader. An unknown fraction of these uORFs encode sequence conserved peptides (conserved peptide uORFs, CPuORFs). Experimentally validated CPuORFs demonstrated to regulate the translation of downstream mORFs often do so in a metabolite concentration-dependent manner. Previous research has shown that most CPuORFs possess a start codon context suboptimal for translation initiation, which turns out to be favorable for translational regulation. The suboptimal initiation context may even include non-AUG start codons, which makes CPuORFs hard to predict. For this reason, we developed a novel pipeline to identify CPuORFs unbiased of start codon using well-annotated sequence data from 31 eudicot plant species and rice. Our new pipeline was able to identify 29 novel Arabidopsis thaliana (Arabidopsis) CPuORFs, conserved across a wide variety of eudicot species of which 15 do not initiate with an AUG start codon. In addition to CPuORFs, the pipeline was able to find 14 conserved coding regions directly upstream and in frame with the mORF, which likely initiate translation on a non-AUG start codon. Altogether, our pipeline identified highly conserved coding regions in the 5′ leaders of Arabidopsis transcripts, including in genes with proven functional importance such as LHY, a key regulator of the circadian clock, and the RAPTOR1 subunit of the target of rapamycin (TOR) kinase.
format Online
Article
Text
id pubmed-6380273
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher Cold Spring Harbor Laboratory Press
record_format MEDLINE/PubMed
spelling pubmed-6380273 2019-03-09 RNA Bioinformatics
Cold Spring Harbor Laboratory Press 2019-03 /pmc/articles/PMC6380273/ /pubmed/30567971 http://dx.doi.org/10.1261/rna.067983.118 Text en © 2019 van der Horst et al.; Published by Cold Spring Harbor Laboratory Press for the RNA Society http://creativecommons.org/licenses/by-nc/4.0/ This article, published in RNA, is available under a Creative Commons License (Attribution-NonCommercial 4.0 International), as described at http://creativecommons.org/licenses/by-nc/4.0/.
title Novel pipeline identifies new upstream ORFs and non-AUG initiating main ORFs with conserved amino acid sequences in the 5′ leader of mRNAs in Arabidopsis thaliana
topic Bioinformatics