Improved circRNA Identification by Combining Prediction Algorithms
Non-coding RNA is an interesting class of gene regulators with diverse functionalities. One large subgroup of non-coding RNAs is the recently discovered class of circular RNAs (circRNAs). CircRNAs are conserved and expressed in a tissue- and development-specific manner, although for the vast majori...
Main author: | Hansen, Thomas B. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Frontiers Media S.A. 2018 |
Subjects: | Cell and Developmental Biology |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5844931/ https://www.ncbi.nlm.nih.gov/pubmed/29556495 http://dx.doi.org/10.3389/fcell.2018.00020 |
_version_ | 1783305320711323648 |
---|---|
author | Hansen, Thomas B. |
author_facet | Hansen, Thomas B. |
author_sort | Hansen, Thomas B. |
collection | PubMed |
description | Non-coding RNA is an interesting class of gene regulators with diverse functionalities. One large subgroup of non-coding RNAs is the recently discovered class of circular RNAs (circRNAs). CircRNAs are conserved and expressed in a tissue- and development-specific manner, although for the vast majority, the functional relevance remains unclear. To identify and quantify circRNA expression, several bioinformatic pipelines have been developed to assess the catalog of circRNAs in any given total RNA sequencing dataset. We recently compared five different algorithms for circRNA detection, but here this analysis is extended to 11 algorithms. By comparing the number of circRNAs discovered and their respective sensitivity to RNaseR digestion, the sensitivity and specificity of each algorithm are evaluated. Moreover, the ability to predict de novo circRNAs, i.e., circRNAs not derived from annotated splice sites, is also determined, as well as the effect of eliminating low-quality and adaptor-containing reads prior to circRNA prediction. Finally, and most importantly, all possible pair-wise combinations of algorithms are tested and guidelines for algorithm complementarity are provided. In conclusion, the algorithms mostly agree on highly expressed circRNAs; however, in many cases, algorithm-specific false positives with high read counts are predicted, a problem that is resolved by using the shared output from two (or more) algorithms. |
format | Online Article Text |
id | pubmed-5844931 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2018 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-5844931 2018-03-19 Improved circRNA Identification by Combining Prediction Algorithms Hansen, Thomas B. Front Cell Dev Biol Cell and Developmental Biology Non-coding RNA is an interesting class of gene regulators with diverse functionalities. One large subgroup of non-coding RNAs is the recently discovered class of circular RNAs (circRNAs). CircRNAs are conserved and expressed in a tissue- and development-specific manner, although for the vast majority, the functional relevance remains unclear. To identify and quantify circRNA expression, several bioinformatic pipelines have been developed to assess the catalog of circRNAs in any given total RNA sequencing dataset. We recently compared five different algorithms for circRNA detection, but here this analysis is extended to 11 algorithms. By comparing the number of circRNAs discovered and their respective sensitivity to RNaseR digestion, the sensitivity and specificity of each algorithm are evaluated. Moreover, the ability to predict de novo circRNAs, i.e., circRNAs not derived from annotated splice sites, is also determined, as well as the effect of eliminating low-quality and adaptor-containing reads prior to circRNA prediction. Finally, and most importantly, all possible pair-wise combinations of algorithms are tested and guidelines for algorithm complementarity are provided. In conclusion, the algorithms mostly agree on highly expressed circRNAs; however, in many cases, algorithm-specific false positives with high read counts are predicted, a problem that is resolved by using the shared output from two (or more) algorithms. Frontiers Media S.A. 2018-03-05 /pmc/articles/PMC5844931/ /pubmed/29556495 http://dx.doi.org/10.3389/fcell.2018.00020 Text en Copyright © 2018 Hansen. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
spellingShingle | Cell and Developmental Biology Hansen, Thomas B. Improved circRNA Identification by Combining Prediction Algorithms |
title | Improved circRNA Identification by Combining Prediction Algorithms |
title_full | Improved circRNA Identification by Combining Prediction Algorithms |
title_fullStr | Improved circRNA Identification by Combining Prediction Algorithms |
title_full_unstemmed | Improved circRNA Identification by Combining Prediction Algorithms |
title_short | Improved circRNA Identification by Combining Prediction Algorithms |
title_sort | improved circrna identification by combining prediction algorithms |
topic | Cell and Developmental Biology |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5844931/ https://www.ncbi.nlm.nih.gov/pubmed/29556495 http://dx.doi.org/10.3389/fcell.2018.00020 |
work_keys_str_mv | AT hansenthomasb improvedcircrnaidentificationbycombiningpredictionalgorithms |
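
The description above recommends testing all pair-wise combinations of algorithms and keeping the shared output of two (or more) of them. This record does not include the article's actual procedure, so the following is only a minimal sketch of that idea. It assumes each algorithm's calls have already been parsed into a dictionary keyed by back-splice junction `(chromosome, start, end, strand)` with coordinates normalized to one convention. The function names, algorithm labels, and read counts are hypothetical illustration data, not values from the article.

```python
from itertools import combinations


def shared_calls(a: dict, b: dict) -> dict:
    """Keep only junctions reported by both algorithms, with both read counts."""
    return {junction: (a[junction], b[junction]) for junction in a.keys() & b.keys()}


def pairwise_shared(calls: dict) -> dict:
    """Compute the shared output for every unordered pair of algorithms."""
    return {
        (x, y): shared_calls(calls[x], calls[y])
        for x, y in combinations(sorted(calls), 2)
    }


# Hypothetical calls: junction -> supporting read count.
calls = {
    "algo_A": {("chr1", 100, 500, "+"): 40, ("chr2", 10, 90, "-"): 250},
    "algo_B": {("chr1", 100, 500, "+"): 37, ("chr3", 5, 60, "+"): 12},
    "algo_C": {("chr1", 100, 500, "+"): 42, ("chr3", 5, 60, "+"): 9},
}

result = pairwise_shared(calls)
for pair, shared in result.items():
    print(pair, len(shared))
```

In this toy data the `chr2` junction has a high read count but is reported only by `algo_A`, so it is absent from every pair-wise intersection. That mirrors the kind of algorithm-specific, high-count candidate the description says the shared output filters out.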