
Evaluating the bias of circRNA predictions from total RNA-Seq data

CircRNAs are a group of endogenous noncoding RNAs. Rapidly developing high-throughput RNA sequencing technologies, together with novel bioinformatics approaches, have enabled researchers to systematically identify circRNAs and their biological functions in cells. Deep sequencing of rRNA-depleted RNAs...

Bibliographic Details
Main authors: Wang, Jinzeng; Liu, Kang; Liu, Ya; Lv, Qi; Zhang, Fan; Wang, Haiyun
Format: Online Article Text
Language: English
Published: Impact Journals LLC 2017
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5762294/
https://www.ncbi.nlm.nih.gov/pubmed/29340026
http://dx.doi.org/10.18632/oncotarget.22972
_version_ 1783291654537478144
author Wang, Jinzeng
Liu, Kang
Liu, Ya
Lv, Qi
Zhang, Fan
Wang, Haiyun
collection PubMed
description CircRNAs are a group of endogenous noncoding RNAs. Rapidly developing high-throughput RNA sequencing technologies, together with novel bioinformatics approaches, have enabled researchers to systematically identify circRNAs and their biological functions in cells. Deep sequencing of rRNA-depleted RNAs treated with RNase R, which digests linear RNAs and leaves circRNAs enriched, is an efficient way to identify circRNAs. However, few RNase R-treated datasets are available, whereas a large amount of total RNA-Seq data can be used for circRNA prediction at no additional sequencing cost. In this study, we systematically investigated the prediction bias from total RNA-Seq data, as well as the influence of sequencing depth, sequencing quality, and single-end versus paired-end sequencing strategy on the predictions. We also identified circRNA properties that may help improve prediction performance. Our analysis shows that circRNA predictions from total RNA-Seq data yield roughly 50% true positives. Sequencing errors, rather than a single-end sequencing strategy or low sequencing depth, dramatically worsen the predictions. However, false positives can be controlled by using good-quality data and by narrowing down candidate circRNAs according to their properties.
format Online
Article
Text
id pubmed-5762294
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher Impact Journals LLC
record_format MEDLINE/PubMed
spelling pubmed-5762294 2018-01-16 Evaluating the bias of circRNA predictions from total RNA-Seq data Wang, Jinzeng; Liu, Kang; Liu, Ya; Lv, Qi; Zhang, Fan; Wang, Haiyun Oncotarget Research Paper CircRNAs are a group of endogenous noncoding RNAs. Rapidly developing high-throughput RNA sequencing technologies, together with novel bioinformatics approaches, have enabled researchers to systematically identify circRNAs and their biological functions in cells. Deep sequencing of rRNA-depleted RNAs treated with RNase R, which digests linear RNAs and leaves circRNAs enriched, is an efficient way to identify circRNAs. However, few RNase R-treated datasets are available, whereas a large amount of total RNA-Seq data can be used for circRNA prediction at no additional sequencing cost. In this study, we systematically investigated the prediction bias from total RNA-Seq data, as well as the influence of sequencing depth, sequencing quality, and single-end versus paired-end sequencing strategy on the predictions. We also identified circRNA properties that may help improve prediction performance. Our analysis shows that circRNA predictions from total RNA-Seq data yield roughly 50% true positives. Sequencing errors, rather than a single-end sequencing strategy or low sequencing depth, dramatically worsen the predictions. However, false positives can be controlled by using good-quality data and by narrowing down candidate circRNAs according to their properties. Impact Journals LLC 2017-12-06 /pmc/articles/PMC5762294/ /pubmed/29340026 http://dx.doi.org/10.18632/oncotarget.22972 Text en Copyright: © 2017 Wang et al. http://creativecommons.org/licenses/by/3.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License 3.0 (CC BY 3.0) (http://creativecommons.org/licenses/by/3.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title Evaluating the bias of circRNA predictions from total RNA-Seq data
topic Research Paper