Biogenesis mechanisms of circular RNA can be categorized through feature extraction of a machine learning model

MOTIVATION: In recent years, multiple circular RNA (circRNA) biogenesis mechanisms have been discovered. Although each reported mechanism has been experimentally verified in different circRNAs, no single biogenesis mechanism has been proposed that can universally explain the biogenesis of all tens...

Bibliographic Details
Main authors: Liu, Chengyu; Liu, Yu-Chen; Huang, Hsien-Da; Wang, Wei
Format: Online Article Text
Language: English
Published: Oxford University Press 2019
Subjects: Discovery Note
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6901070/
https://www.ncbi.nlm.nih.gov/pubmed/31529043
http://dx.doi.org/10.1093/bioinformatics/btz705
author Liu, Chengyu
Liu, Yu-Chen
Huang, Hsien-Da
Wang, Wei
collection PubMed
description MOTIVATION: In recent years, multiple circular RNA (circRNA) biogenesis mechanisms have been discovered. Although each reported mechanism has been experimentally verified in different circRNAs, no single biogenesis mechanism has been proposed that can universally explain the biogenesis of all tens of thousands of discovered circRNAs. Under the hypothesis that human circRNAs can be categorized according to different biogenesis mechanisms, we designed a contextual regression model trained to predict the formation of circular RNA from a random genomic locus on the human genome, with potential biogenesis factors of circular RNA as the features of the training data. RESULTS: After achieving high prediction accuracy, we found through the feature extraction technique that the examined human circRNAs can be categorized into seven subgroups, according to the presence of the following sequence features within the flanking regions of the circular RNA back-spliced junction sites: RNA editing sites, simple repeat sequences, self-chains, RNA binding protein binding sites and CpG islands. These results support all of the previously reported biogenesis mechanisms of circRNA and solidify the idea that multiple biogenesis mechanisms co-exist for different subsets of human circRNAs. Furthermore, we uncover potential new links between circRNA biogenesis and flanking CpG islands. We have also identified RNA binding proteins putatively correlated with circRNA biogenesis. AVAILABILITY AND IMPLEMENTATION: Scripts and tutorial are available at http://wanglab.ucsd.edu/star/circRNA. This program is under GNU General Public License v3.0. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
format Online
Article
Text
id pubmed-6901070
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-6901070 2019-12-16 Bioinformatics Discovery Note
Oxford University Press 2019-12-01 2019-09-16 /pmc/articles/PMC6901070/ /pubmed/31529043 http://dx.doi.org/10.1093/bioinformatics/btz705 Text en © The Author(s) 2019. Published by Oxford University Press. http://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
title Biogenesis mechanisms of circular RNA can be categorized through feature extraction of a machine learning model
topic Discovery Note