GC-AG Introns Features in Long Non-coding and Protein-Coding Genes Suggest Their Role in Gene Expression Regulation

Long non-coding RNAs (lncRNAs) are recognized as an important class of regulatory molecules involved in a variety of biological functions. However, the regulatory mechanisms of long non-coding genes expression are still poorly understood. The characterization of the genomic features of lncRNAs is cr...

Bibliographic Details
Main Authors: Abou Alezz, Monah; Celli, Ludovica; Belotti, Giulia; Lisa, Antonella; Bione, Silvia
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2020
Subjects: Genetics
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7242645/
https://www.ncbi.nlm.nih.gov/pubmed/32499820
http://dx.doi.org/10.3389/fgene.2020.00488
author Abou Alezz, Monah
Celli, Ludovica
Belotti, Giulia
Lisa, Antonella
Bione, Silvia
collection PubMed
description Long non-coding RNAs (lncRNAs) are recognized as an important class of regulatory molecules involved in a variety of biological functions. However, the regulatory mechanisms of long non-coding genes expression are still poorly understood. The characterization of the genomic features of lncRNAs is crucial to get insight into their function. In this study, we exploited recent annotations by GENCODE to characterize the genomic and splicing features of long non-coding genes in comparison with protein-coding ones, both in human and mouse. Our analysis highlighted differences between the two classes of genes in terms of their gene architecture. Significant differences in the splice sites usage were observed between long non-coding and protein-coding genes (PCG). While the frequency of non-canonical GC-AG splice junctions represents about 0.8% of total splice sites in PCGs, we identified a significant enrichment of the GC-AG splice sites in long non-coding genes, both in human (3.0%) and mouse (1.9%). In addition, we found a positional bias of GC-AG splice sites being enriched in the first intron in both classes of genes. Moreover, a significant shorter length and weaker donor and acceptor sites were found comparing GC-AG introns to GT-AG introns. Genes containing at least one GC-AG intron were found conserved in many species, more prone to alternative splicing and a functional analysis pointed toward their enrichment in specific biological processes such as DNA repair. Our study shows for the first time that GC-AG introns are mainly associated with lncRNAs and are preferentially located in the first intron. Additionally, we discovered their regulatory potential indicating the existence of a new mechanism of non-coding and PCGs expression regulation.
format Online
Article
Text
id pubmed-7242645
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-7242645 2020-06-03 Front Genet Genetics Frontiers Media S.A. 2020-05-15 /pmc/articles/PMC7242645/ /pubmed/32499820 http://dx.doi.org/10.3389/fgene.2020.00488 Text en Copyright © 2020 Abou Alezz, Celli, Belotti, Lisa and Bione. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title GC-AG Introns Features in Long Non-coding and Protein-Coding Genes Suggest Their Role in Gene Expression Regulation
topic Genetics