A common class of transcripts with 5′-intron depletion, distinct early coding sequence features, and N(1)-methyladenosine modification

Introns are found in 5′ untranslated regions (5′UTRs) for 35% of all human transcripts. These 5′UTR introns are not randomly distributed: Genes that encode secreted, membrane-bound and mitochondrial proteins are less likely to have them. Curiously, transcripts lacking 5′UTR introns tend to harbor sp...

Bibliographic Details
Main Authors: Cenik, Can; Chua, Hon Nian; Singh, Guramrit; Akef, Abdalla; Snyder, Michael P.; Palazzo, Alexander F.; Moore, Melissa J.; Roth, Frederick P.
Format: Online Article Text
Language: English
Published: Cold Spring Harbor Laboratory Press 2017
Subjects: Bioinformatics
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5311483/
https://www.ncbi.nlm.nih.gov/pubmed/27994090
http://dx.doi.org/10.1261/rna.059105.116
_version_ 1782508031113691136
author Cenik, Can
Chua, Hon Nian
Singh, Guramrit
Akef, Abdalla
Snyder, Michael P.
Palazzo, Alexander F.
Moore, Melissa J.
Roth, Frederick P.
collection PubMed
description Introns are found in 5′ untranslated regions (5′UTRs) for 35% of all human transcripts. These 5′UTR introns are not randomly distributed: Genes that encode secreted, membrane-bound and mitochondrial proteins are less likely to have them. Curiously, transcripts lacking 5′UTR introns tend to harbor specific RNA sequence elements in their early coding regions. To model and understand the connection between coding-region sequence and 5′UTR intron status, we developed a classifier that can predict 5′UTR intron status with >80% accuracy using only sequence features in the early coding region. Thus, the classifier identifies transcripts with 5′ proximal-intron-minus-like-coding regions (“5IM” transcripts). Unexpectedly, we found that the early coding sequence features defining 5IM transcripts are widespread, appearing in 21% of all human RefSeq transcripts. The 5IM class of transcripts is enriched for non-AUG start codons, more extensive secondary structure both preceding the start codon and near the 5′ cap, greater dependence on eIF4E for translation, and association with ER-proximal ribosomes. 5IM transcripts are bound by the exon junction complex (EJC) at noncanonical 5′ proximal positions. Finally, N(1)-methyladenosines are specifically enriched in the early coding regions of 5IM transcripts. Taken together, our analyses point to the existence of a distinct 5IM class comprising ∼20% of human transcripts. This class is defined by depletion of 5′ proximal introns, presence of specific RNA sequence features associated with low translation efficiency, N(1)-methyladenosines in the early coding region, and enrichment for noncanonical binding by the EJC.
format Online
Article
Text
id pubmed-5311483
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher Cold Spring Harbor Laboratory Press
record_format MEDLINE/PubMed
spelling pubmed-5311483 2017-03-01 RNA Bioinformatics
Cold Spring Harbor Laboratory Press 2017-03 /pmc/articles/PMC5311483/ /pubmed/27994090 http://dx.doi.org/10.1261/rna.059105.116 Text en © 2017 Cenik et al.; Published by Cold Spring Harbor Laboratory Press for the RNA Society http://creativecommons.org/licenses/by/4.0/ This article, published in RNA, is available under a Creative Commons License (Attribution 4.0 International), as described at http://creativecommons.org/licenses/by/4.0/.
title A common class of transcripts with 5′-intron depletion, distinct early coding sequence features, and N(1)-methyladenosine modification
topic Bioinformatics
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5311483/
https://www.ncbi.nlm.nih.gov/pubmed/27994090
http://dx.doi.org/10.1261/rna.059105.116