Biogenesis mechanisms of circular RNA can be categorized through feature extraction of a machine learning model
MOTIVATION: In recent years, multiple circular RNA (circRNA) biogenesis mechanisms have been discovered. Although each reported mechanism has been experimentally verified in different circRNAs, no single biogenesis mechanism has been proposed that can universally explain the biogenesis of all tens...
Main authors: | Liu, Chengyu; Liu, Yu-Chen; Huang, Hsien-Da; Wang, Wei |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press, 2019 |
Subjects: | Discovery Note |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6901070/ https://www.ncbi.nlm.nih.gov/pubmed/31529043 http://dx.doi.org/10.1093/bioinformatics/btz705 |
_version_ | 1783477448714747904 |
---|---|
author | Liu, Chengyu Liu, Yu-Chen Huang, Hsien-Da Wang, Wei |
author_facet | Liu, Chengyu Liu, Yu-Chen Huang, Hsien-Da Wang, Wei |
author_sort | Liu, Chengyu |
collection | PubMed |
description | MOTIVATION: In recent years, multiple circular RNA (circRNA) biogenesis mechanisms have been discovered. Although each reported mechanism has been experimentally verified in different circRNAs, no single biogenesis mechanism has been proposed that can universally explain the biogenesis of all tens of thousands of discovered circRNAs. Under the hypothesis that human circRNAs can be categorized according to different biogenesis mechanisms, we designed a contextual regression model trained to predict the formation of circular RNA from a random genomic locus on the human genome, with potential biogenesis factors of circular RNA as the features of the training data. RESULTS: After achieving high prediction accuracy, we found through the feature extraction technique that the examined human circRNAs can be categorized into seven subgroups, according to the presence of the following sequence features: RNA editing sites, simple repeat sequences, self-chains, RNA binding protein binding sites and CpG islands within the flanking regions of the circular RNA back-spliced junction sites. These results support all of the previously reported biogenesis mechanisms of circRNA and solidify the idea that multiple biogenesis mechanisms co-exist for different subsets of human circRNAs. Furthermore, we uncover a potential new link between circRNA biogenesis and flanking CpG islands. We have also identified RNA binding proteins putatively correlated with circRNA biogenesis. AVAILABILITY AND IMPLEMENTATION: Scripts and tutorial are available at http://wanglab.ucsd.edu/star/circRNA. This program is under GNU General Public License v3.0. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online. |
format | Online Article Text |
id | pubmed-6901070 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2019 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-6901070 2019-12-16 Biogenesis mechanisms of circular RNA can be categorized through feature extraction of a machine learning model Liu, Chengyu Liu, Yu-Chen Huang, Hsien-Da Wang, Wei Bioinformatics Discovery Note MOTIVATION: In recent years, multiple circular RNA (circRNA) biogenesis mechanisms have been discovered. Although each reported mechanism has been experimentally verified in different circRNAs, no single biogenesis mechanism has been proposed that can universally explain the biogenesis of all tens of thousands of discovered circRNAs. Under the hypothesis that human circRNAs can be categorized according to different biogenesis mechanisms, we designed a contextual regression model trained to predict the formation of circular RNA from a random genomic locus on the human genome, with potential biogenesis factors of circular RNA as the features of the training data. RESULTS: After achieving high prediction accuracy, we found through the feature extraction technique that the examined human circRNAs can be categorized into seven subgroups, according to the presence of the following sequence features: RNA editing sites, simple repeat sequences, self-chains, RNA binding protein binding sites and CpG islands within the flanking regions of the circular RNA back-spliced junction sites. These results support all of the previously reported biogenesis mechanisms of circRNA and solidify the idea that multiple biogenesis mechanisms co-exist for different subsets of human circRNAs. Furthermore, we uncover a potential new link between circRNA biogenesis and flanking CpG islands. We have also identified RNA binding proteins putatively correlated with circRNA biogenesis. AVAILABILITY AND IMPLEMENTATION: Scripts and tutorial are available at http://wanglab.ucsd.edu/star/circRNA. This program is under GNU General Public License v3.0. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Oxford University Press 2019-12-01 2019-09-16 /pmc/articles/PMC6901070/ /pubmed/31529043 http://dx.doi.org/10.1093/bioinformatics/btz705 Text en © The Author(s) 2019. Published by Oxford University Press. http://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com |
spellingShingle | Discovery Note Liu, Chengyu Liu, Yu-Chen Huang, Hsien-Da Wang, Wei Biogenesis mechanisms of circular RNA can be categorized through feature extraction of a machine learning model |
title | Biogenesis mechanisms of circular RNA can be categorized through feature extraction of a machine learning model |
title_full | Biogenesis mechanisms of circular RNA can be categorized through feature extraction of a machine learning model |
title_fullStr | Biogenesis mechanisms of circular RNA can be categorized through feature extraction of a machine learning model |
title_full_unstemmed | Biogenesis mechanisms of circular RNA can be categorized through feature extraction of a machine learning model |
title_short | Biogenesis mechanisms of circular RNA can be categorized through feature extraction of a machine learning model |
title_sort | biogenesis mechanisms of circular rna can be categorized through feature extraction of a machine learning model |
topic | Discovery Note |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6901070/ https://www.ncbi.nlm.nih.gov/pubmed/31529043 http://dx.doi.org/10.1093/bioinformatics/btz705 |
work_keys_str_mv | AT liuchengyu biogenesismechanismsofcircularrnacanbecategorizedthroughfeatureextractionofamachinelearningmodel AT liuyuchen biogenesismechanismsofcircularrnacanbecategorizedthroughfeatureextractionofamachinelearningmodel AT huanghsienda biogenesismechanismsofcircularrnacanbecategorizedthroughfeatureextractionofamachinelearningmodel AT wangwei biogenesismechanismsofcircularrnacanbecategorizedthroughfeatureextractionofamachinelearningmodel |