Evaluation of methods to detect circular RNAs from single-end RNA-sequencing data
BACKGROUND: Circular RNA (circRNA), a class of RNA molecule with a loop structure, has recently attracted researchers due to its diverse biological functions and potential as a biomarker of human diseases. Most of the current circRNA detection methods from RNA-sequencing (RNA-Seq) data utilize the mappi...
Main authors: | Nguyen, Manh Hung; Nguyen, Ha-Nam; Vu, Trung Nghia |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2022 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8822704/ https://www.ncbi.nlm.nih.gov/pubmed/35135477 http://dx.doi.org/10.1186/s12864-022-08329-7 |
_version_ | 1784646652817047552 |
---|---|
author | Nguyen, Manh Hung Nguyen, Ha-Nam Vu, Trung Nghia |
author_facet | Nguyen, Manh Hung Nguyen, Ha-Nam Vu, Trung Nghia |
author_sort | Nguyen, Manh Hung |
collection | PubMed |
description | BACKGROUND: Circular RNA (circRNA), a class of RNA molecule with a loop structure, has recently attracted researchers due to its diverse biological functions and potential as a biomarker of human diseases. Most of the current circRNA detection methods from RNA-sequencing (RNA-Seq) data utilize the mapping information of paired-end (PE) reads to eliminate false positives. However, much of the practical RNA-Seq data, such as cross-linking immunoprecipitation sequencing (CLIP-Seq) data, usually contain single-end (SE) reads. It is not clear how well these tools perform on SE RNA-Seq data. RESULTS: In this study, we present a systematic evaluation of six advanced RNA-based methods and two CLIP-Seq based methods for detecting circRNAs from SE RNA-Seq data. The performances of the methods are rigorously assessed based on precision, sensitivity, F1 score, and true discovery rate. We investigate the impacts of read length, false positive ratio, sequencing depth and PE mapping information on the performances of the methods using simulated SE RNA-Seq datasets. The real datasets used in this study consist of four experimental RNA-Seq datasets with ≥100 bp read length and 124 CLIP-Seq samples from 45 studies that contain mostly short-read (≤50 bp) RNA-Seq data. The simulation study shows that the sensitivities of most of the methods can be improved by increasing either read length or sequencing depth, and that the levels of false positive rates significantly affect the precision of all methods. Furthermore, the PE mapping information can improve the methods’ precision but cannot always guarantee an increase in F1 score. Overall, no method is dominant for all SE RNA-Seq data. The RNA-based methods perform better for the long-read datasets but are worse for the short-read datasets. In contrast, the CLIP-Seq based methods outperform the RNA-Seq based methods for all the short-read samples. Combining the results of these methods can significantly improve precision in the CLIP-Seq data. CONCLUSIONS: The results provide a systematic evaluation of circRNA detection methods on SE RNA-Seq data that would facilitate researchers’ strategies in circRNA analysis. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at (10.1186/s12864-022-08329-7). |
format | Online Article Text |
id | pubmed-8822704 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-88227042022-02-08 Evaluation of methods to detect circular RNAs from single-end RNA-sequencing data Nguyen, Manh Hung Nguyen, Ha-Nam Vu, Trung Nghia BMC Genomics Research BACKGROUND: Circular RNA (circRNA), a class of RNA molecule with a loop structure, has recently attracted researchers due to its diverse biological functions and potential biomarkers of human diseases. Most of the current circRNA detection methods from RNA-sequencing (RNA-Seq) data utilize the mapping information of paired-end (PE) reads to eliminate false positives. However, much of the practical RNA-Seq data such as cross-linking immunoprecipitation sequencing (CLIP-Seq) data usually contain single-end (SE) reads. It is not clear how well these tools perform on SE RNA-Seq data. RESULTS: In this study, we present a systematic evaluation of six advanced RNA-based methods and two CLIP-Seq based methods for detecting circRNAs from SE RNA-Seq data. The performances of the methods are rigorously assessed based on precision, sensitivity, F1 score, and true discovery rate. We investigate the impacts of read length, false positive ratio, sequencing depth and PE mapping information on the performances of the methods using simulated SE RNA-Seq simulated datasets. The real datasets used in this study consist of four experimental RNA-Seq datasets with ≥100bp read length and 124 CLIP-Seq samples from 45 studies that contain mostly short-read (≤50bp) RNA-Seq data. The simulation study shows that the sensitivities of most of the methods can be improved by increasing either read length or sequencing depth, and that the levels of false positive rates significantly affect the precision of all methods. Furthermore, the PE mapping information can improve the method’s precision but can not always guarantee the increase of F1 score. Overall, no method is dominant for all SE RNA-Seq data. The RNA-based methods perform better for the long-read datasets but are worse for the short-read datasets. 
In contrast, the CLIP-Seq based methods outperform the RNA-Seq based methods for all the short-read samples. Combining the results of these methods can significantly improve precision in the CLIP-Seq data. CONCLUSIONS: The results provide a systematic evaluation of circRNA detection methods on SE RNA-Seq data that would facilitate researchers’ strategies in circRNA analysis. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at (10.1186/s12864-022-08329-7). BioMed Central 2022-02-08 /pmc/articles/PMC8822704/ /pubmed/35135477 http://dx.doi.org/10.1186/s12864-022-08329-7 Text en © The Author(s) 2022 https://creativecommons.org/licenses/by/4.0/Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ (https://creativecommons.org/licenses/by/4.0/) . The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/ (https://creativecommons.org/publicdomain/zero/1.0/) ) applies to the data made available in this article, unless otherwise stated in a credit line to the data. |
spellingShingle | Research Nguyen, Manh Hung Nguyen, Ha-Nam Vu, Trung Nghia Evaluation of methods to detect circular RNAs from single-end RNA-sequencing data |
title | Evaluation of methods to detect circular RNAs from single-end RNA-sequencing data |
title_full | Evaluation of methods to detect circular RNAs from single-end RNA-sequencing data |
title_fullStr | Evaluation of methods to detect circular RNAs from single-end RNA-sequencing data |
title_full_unstemmed | Evaluation of methods to detect circular RNAs from single-end RNA-sequencing data |
title_short | Evaluation of methods to detect circular RNAs from single-end RNA-sequencing data |
title_sort | evaluation of methods to detect circular rnas from single-end rna-sequencing data |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8822704/ https://www.ncbi.nlm.nih.gov/pubmed/35135477 http://dx.doi.org/10.1186/s12864-022-08329-7 |
work_keys_str_mv | AT nguyenmanhhung evaluationofmethodstodetectcircularrnasfromsingleendrnasequencingdata AT nguyenhanam evaluationofmethodstodetectcircularrnasfromsingleendrnasequencingdata AT vutrungnghia evaluationofmethodstodetectcircularrnasfromsingleendrnasequencingdata |
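
The description field above names precision, sensitivity and F1 score as evaluation metrics and notes that combining the results of several methods can improve precision. The following is a minimal sketch of how such metrics are commonly computed from sets of circRNA calls; it is not the authors' evaluation code. The function name `evaluate_calls`, the exact-coordinate matching rule, and all coordinates are hypothetical and chosen only for illustration.

```python
def evaluate_calls(predicted, truth):
    """Compare predicted circRNA identifiers against a truth set.

    Assumption: a call counts as a true positive only if its identifier
    (here a "chrom:start-end" string) matches a truth entry exactly.
    """
    tp = len(predicted & truth)   # calls that are in the truth set
    fp = len(predicted - truth)   # calls that are not in the truth set
    fn = len(truth - predicted)   # truth entries that were missed
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    return {"precision": precision, "sensitivity": sensitivity, "f1": f1}


# Hypothetical truth set and calls from two hypothetical methods.
truth = {"chr1:100-500", "chr1:900-1500", "chr2:50-300", "chr3:10-80"}
method_a = {"chr1:100-500", "chr1:900-1500", "chr5:1-20"}
method_b = {"chr1:100-500", "chr1:900-1500", "chr2:50-300", "chr7:5-60"}

result_a = evaluate_calls(method_a, truth)
result_b = evaluate_calls(method_b, truth)

# One simple way to combine methods: keep only calls reported by both.
result_both = evaluate_calls(method_a & method_b, truth)

for name, res in [("A", result_a), ("B", result_b), ("A and B", result_both)]:
    print(name, {k: round(v, 3) for k, v in res.items()})
```

In this toy example the intersection of the two call sets contains no false positives, so its precision rises while its sensitivity does not exceed that of either method alone. Taking the intersection is only one possible combination rule and is assumed here for simplicity.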