Unconstrained mining of transcript data reveals increased alternative splicing complexity in the human transcriptome
Mining massive amounts of transcript data for alternative splicing information is paramount to help understand how the maturation of RNA regulates gene expression. We developed an algorithm to cluster transcript data to annotated genes to detect unannotated splice variants. A higher number of altern...
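Main authors: | Mollet, I. G.; Ben-Dov, Claudia; Felício-Silva, Daniel; Grosso, A. R.; Eleutério, Pedro; Alves, Ruben; Staller, Ray; Silva, Tito Santos; Carmo-Fonseca, Maria |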
---|---|
Format: | Text |
Language: | English |
Published: | Oxford University Press, 2010 |
Subjects: | Genomics |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2919708/ https://www.ncbi.nlm.nih.gov/pubmed/20385588 http://dx.doi.org/10.1093/nar/gkq197 |
_version_ | 1782185211649327104 |
---|---|
author | Mollet, I. G.; Ben-Dov, Claudia; Felício-Silva, Daniel; Grosso, A. R.; Eleutério, Pedro; Alves, Ruben; Staller, Ray; Silva, Tito Santos; Carmo-Fonseca, Maria |
author_facet | Mollet, I. G.; Ben-Dov, Claudia; Felício-Silva, Daniel; Grosso, A. R.; Eleutério, Pedro; Alves, Ruben; Staller, Ray; Silva, Tito Santos; Carmo-Fonseca, Maria |
author_sort | Mollet, I. G. |
collection | PubMed |
description | Mining massive amounts of transcript data for alternative splicing information is paramount to help understand how the maturation of RNA regulates gene expression. We developed an algorithm to cluster transcript data to annotated genes to detect unannotated splice variants. A higher number of alternatively spliced genes and isoforms were found compared to other alternative splicing databases. Comparison of human and mouse data revealed a marked increase, in human, of splice variants incorporating novel exons and retained introns. Previously unannotated exons were validated by tiling array expression data and shown to correspond preferentially to novel first exons. Retained introns were validated by tiling array and deep sequencing data. The majority of retained introns were shorter than 500 nt and had weak polypyrimidine tracts. A subset of retained introns matching small RNAs and displaying a high GC content suggests a possible coordination between splicing regulation and production of noncoding RNAs. Conservation of unannotated exons and retained introns was higher in horse, dog and cow than in rodents, and 64% of exon sequences were only found in primates. This analysis highlights previously bypassed alternative splice variants, which may be crucial to deciphering more complex pathways of gene regulation in human. |
format | Text |
id | pubmed-2919708 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2010 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-29197082010-08-11 Unconstrained mining of transcript data reveals increased alternative splicing complexity in the human transcriptome Mollet, I. G. Ben-Dov, Claudia Felício-Silva, Daniel Grosso, A. R. Eleutério, Pedro Alves, Ruben Staller, Ray Silva, Tito Santos Carmo-Fonseca, Maria Nucleic Acids Res Genomics Mining massive amounts of transcript data for alternative splicing information is paramount to help understand how the maturation of RNA regulates gene expression. We developed an algorithm to cluster transcript data to annotated genes to detect unannotated splice variants. A higher number of alternatively spliced genes and isoforms were found compared to other alternative splicing databases. Comparison of human and mouse data revealed a marked increase, in human, of splice variants incorporating novel exons and retained introns. Previously unannotated exons were validated by tiling array expression data and shown to correspond preferentially to novel first exons. Retained introns were validated by tiling array and deep sequencing data. The majority of retained introns were shorter than 500 nt and had weak polypyrimidine tracts. A subset of retained introns matching small RNAs and displaying a high GC content suggests a possible coordination between splicing regulation and production of noncoding RNAs. Conservation of unannotated exons and retained introns was higher in horse, dog and cow than in rodents, and 64% of exon sequences were only found in primates. This analysis highlights previously bypassed alternative splice variants, which may be crucial to deciphering more complex pathways of gene regulation in human. Oxford University Press 2010-08 2010-04-12 /pmc/articles/PMC2919708/ /pubmed/20385588 http://dx.doi.org/10.1093/nar/gkq197 Text en © The Author(s) 2010. Published by Oxford University Press. 
http://creativecommons.org/licenses/by-nc/2.5 This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.5), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Genomics Mollet, I. G. Ben-Dov, Claudia Felício-Silva, Daniel Grosso, A. R. Eleutério, Pedro Alves, Ruben Staller, Ray Silva, Tito Santos Carmo-Fonseca, Maria Unconstrained mining of transcript data reveals increased alternative splicing complexity in the human transcriptome |
title | Unconstrained mining of transcript data reveals increased alternative splicing complexity in the human transcriptome |
title_full | Unconstrained mining of transcript data reveals increased alternative splicing complexity in the human transcriptome |
title_fullStr | Unconstrained mining of transcript data reveals increased alternative splicing complexity in the human transcriptome |
title_full_unstemmed | Unconstrained mining of transcript data reveals increased alternative splicing complexity in the human transcriptome |
title_short | Unconstrained mining of transcript data reveals increased alternative splicing complexity in the human transcriptome |
title_sort | unconstrained mining of transcript data reveals increased alternative splicing complexity in the human transcriptome |
topic | Genomics |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2919708/ https://www.ncbi.nlm.nih.gov/pubmed/20385588 http://dx.doi.org/10.1093/nar/gkq197 |
work_keys_str_mv | AT molletig unconstrainedminingoftranscriptdatarevealsincreasedalternativesplicingcomplexityinthehumantranscriptome AT bendovclaudia unconstrainedminingoftranscriptdatarevealsincreasedalternativesplicingcomplexityinthehumantranscriptome AT feliciosilvadaniel unconstrainedminingoftranscriptdatarevealsincreasedalternativesplicingcomplexityinthehumantranscriptome AT grossoar unconstrainedminingoftranscriptdatarevealsincreasedalternativesplicingcomplexityinthehumantranscriptome AT eleuteriopedro unconstrainedminingoftranscriptdatarevealsincreasedalternativesplicingcomplexityinthehumantranscriptome AT alvesruben unconstrainedminingoftranscriptdatarevealsincreasedalternativesplicingcomplexityinthehumantranscriptome AT stallerray unconstrainedminingoftranscriptdatarevealsincreasedalternativesplicingcomplexityinthehumantranscriptome AT silvatitosantos unconstrainedminingoftranscriptdatarevealsincreasedalternativesplicingcomplexityinthehumantranscriptome AT carmofonsecamaria unconstrainedminingoftranscriptdatarevealsincreasedalternativesplicingcomplexityinthehumantranscriptome |