Circall: fast and accurate methodology for discovery of circular RNAs from paired-end RNA-sequencing data

BACKGROUND: Circular RNA (circRNA) is an emerging class of RNA molecules attracting researchers due to its potential for serving as markers for diagnosis, prognosis, or therapeutic targets of cancer, cardiovascular, and autoimmune diseases. Current methods for detection of circRNA from RNA sequencin...

Bibliographic Details
Main authors: Nguyen, Dat Thanh; Trac, Quang Thinh; Nguyen, Thi-Hau; Nguyen, Ha-Nam; Ohad, Nir; Pawitan, Yudi; Vu, Trung Nghia
Format: Online Article Text
Language: English
Published: BioMed Central 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8513298/
https://www.ncbi.nlm.nih.gov/pubmed/34645386
http://dx.doi.org/10.1186/s12859-021-04418-8
_version_ 1784583185322999808
author Nguyen, Dat Thanh
Trac, Quang Thinh
Nguyen, Thi-Hau
Nguyen, Ha-Nam
Ohad, Nir
Pawitan, Yudi
Vu, Trung Nghia
author_facet Nguyen, Dat Thanh
Trac, Quang Thinh
Nguyen, Thi-Hau
Nguyen, Ha-Nam
Ohad, Nir
Pawitan, Yudi
Vu, Trung Nghia
author_sort Nguyen, Dat Thanh
collection PubMed
description BACKGROUND: Circular RNA (circRNA) is an emerging class of RNA molecules attracting researchers due to its potential for serving as markers for diagnosis, prognosis, or therapeutic targets of cancer, cardiovascular, and autoimmune diseases. Current methods for detection of circRNA from RNA sequencing (RNA-seq) focus mostly on improving mapping quality of reads supporting the back-splicing junction (BSJ) of a circRNA to eliminate false positives (FPs). We show that mapping information alone often cannot predict if a BSJ-supporting read is derived from a true circRNA or not, thus increasing the rate of FP circRNAs. RESULTS: We have developed Circall, a novel circRNA detection method from RNA-seq. Circall controls the FPs using a robust multidimensional local false discovery rate method based on the length and expression of circRNAs. It is computationally highly efficient by using a quasi-mapping algorithm for fast and accurate RNA read alignments. We applied Circall on two simulated datasets and three experimental datasets of human cell-lines. The results show that Circall achieves high sensitivity and precision in the simulated data. In the experimental datasets it performs well against current leading methods. Circall is also substantially faster than the other methods, particularly for large datasets. CONCLUSIONS: With those better performances in the detection of circRNAs and in computational time, Circall facilitates the analyses of circRNAs in large numbers of samples. Circall is implemented in C++ and R, and available for use at https://www.meb.ki.se/sites/biostatwiki/circall and https://github.com/datngu/Circall.
format Online
Article
Text
id pubmed-8513298
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-85132982021-10-20 Circall: fast and accurate methodology for discovery of circular RNAs from paired-end RNA-sequencing data Nguyen, Dat Thanh Trac, Quang Thinh Nguyen, Thi-Hau Nguyen, Ha-Nam Ohad, Nir Pawitan, Yudi Vu, Trung Nghia BMC Bioinformatics Research BACKGROUND: Circular RNA (circRNA) is an emerging class of RNA molecules attracting researchers due to its potential for serving as markers for diagnosis, prognosis, or therapeutic targets of cancer, cardiovascular, and autoimmune diseases. Current methods for detection of circRNA from RNA sequencing (RNA-seq) focus mostly on improving mapping quality of reads supporting the back-splicing junction (BSJ) of a circRNA to eliminate false positives (FPs). We show that mapping information alone often cannot predict if a BSJ-supporting read is derived from a true circRNA or not, thus increasing the rate of FP circRNAs. RESULTS: We have developed Circall, a novel circRNA detection method from RNA-seq. Circall controls the FPs using a robust multidimensional local false discovery rate method based on the length and expression of circRNAs. It is computationally highly efficient by using a quasi-mapping algorithm for fast and accurate RNA read alignments. We applied Circall on two simulated datasets and three experimental datasets of human cell-lines. The results show that Circall achieves high sensitivity and precision in the simulated data. In the experimental datasets it performs well against current leading methods. Circall is also substantially faster than the other methods, particularly for large datasets. CONCLUSIONS: With those better performances in the detection of circRNAs and in computational time, Circall facilitates the analyses of circRNAs in large numbers of samples. Circall is implemented in C++ and R, and available for use at https://www.meb.ki.se/sites/biostatwiki/circall and https://github.com/datngu/Circall. 
BioMed Central 2021-10-13 /pmc/articles/PMC8513298/ /pubmed/34645386 http://dx.doi.org/10.1186/s12859-021-04418-8 Text en © The Author(s) 2021 https://creativecommons.org/licenses/by/4.0/ Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit https://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (https://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
spellingShingle Research
Nguyen, Dat Thanh
Trac, Quang Thinh
Nguyen, Thi-Hau
Nguyen, Ha-Nam
Ohad, Nir
Pawitan, Yudi
Vu, Trung Nghia
Circall: fast and accurate methodology for discovery of circular RNAs from paired-end RNA-sequencing data
title Circall: fast and accurate methodology for discovery of circular RNAs from paired-end RNA-sequencing data
title_full Circall: fast and accurate methodology for discovery of circular RNAs from paired-end RNA-sequencing data
title_fullStr Circall: fast and accurate methodology for discovery of circular RNAs from paired-end RNA-sequencing data
title_full_unstemmed Circall: fast and accurate methodology for discovery of circular RNAs from paired-end RNA-sequencing data
title_short Circall: fast and accurate methodology for discovery of circular RNAs from paired-end RNA-sequencing data
title_sort circall: fast and accurate methodology for discovery of circular rnas from paired-end rna-sequencing data
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8513298/
https://www.ncbi.nlm.nih.gov/pubmed/34645386
http://dx.doi.org/10.1186/s12859-021-04418-8
work_keys_str_mv AT nguyendatthanh circallfastandaccuratemethodologyfordiscoveryofcircularrnasfrompairedendrnasequencingdata
AT tracquangthinh circallfastandaccuratemethodologyfordiscoveryofcircularrnasfrompairedendrnasequencingdata
AT nguyenthihau circallfastandaccuratemethodologyfordiscoveryofcircularrnasfrompairedendrnasequencingdata
AT nguyenhanam circallfastandaccuratemethodologyfordiscoveryofcircularrnasfrompairedendrnasequencingdata
AT ohadnir circallfastandaccuratemethodologyfordiscoveryofcircularrnasfrompairedendrnasequencingdata
AT pawitanyudi circallfastandaccuratemethodologyfordiscoveryofcircularrnasfrompairedendrnasequencingdata
AT vutrungnghia circallfastandaccuratemethodologyfordiscoveryofcircularrnasfrompairedendrnasequencingdata