Cargando…
NCLcomparator: systematically post-screening non-co-linear transcripts (circular, trans-spliced, or fusion RNAs) identified from various detectors
BACKGROUND: Non-co-linear (NCL) transcripts consist of exonic sequences that are topologically inconsistent with the reference genome in an intragenic fashion (circular or intragenic trans-spliced RNAs) or in an intergenic fashion (fusion or intergenic trans-spliced RNAs). On the basis of RNA-seq da...
Autores principales: | , |
---|---|
Formato: | Online Artículo Texto |
Lenguaje: | English |
Publicado: |
BioMed Central
2019
|
Materias: | |
Acceso en línea: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6318855/ https://www.ncbi.nlm.nih.gov/pubmed/30606103 http://dx.doi.org/10.1186/s12859-018-2589-0 |
_version_ | 1783384956171452416 |
---|---|
author | Chen, Chia-Ying Chuang, Trees-Juen |
author_facet | Chen, Chia-Ying Chuang, Trees-Juen |
author_sort | Chen, Chia-Ying |
collection | PubMed |
description | BACKGROUND: Non-co-linear (NCL) transcripts consist of exonic sequences that are topologically inconsistent with the reference genome in an intragenic fashion (circular or intragenic trans-spliced RNAs) or in an intergenic fashion (fusion or intergenic trans-spliced RNAs). On the basis of RNA-seq data, numerous NCL event detectors have been developed and detected thousands of NCL events in diverse species. However, there are great discrepancies in the identification results among detectors, indicating a considerable proportion of false positives in the detected NCL events. Although several helpful guidelines for evaluating the performance of NCL event detectors have been provided, a systematic guideline for measurement of NCL events identified by existing tools has not been available. RESULTS: We develop a software, NCLcomparator, for systematically post-screening the intragenic or intergenic NCL events identified by various NCL detectors. NCLcomparator first examine whether the input NCL events are potentially false positives derived from ambiguous alignments (i.e., the NCL events have an alternative co-linear explanation or multiple matches against the reference genome). To evaluate the reliability of the identified NCL events, we define the NCL score (NCL(score)) based on the variation in the number of supporting NCL junction reads identified by the tools examined. Of the input NCL events, we show that the ambiguous alignment-derived events have relatively lower NCL(score) values than the other events, indicating that an NCL event with a higher NCL(score) has a higher level of reliability. To help selecting highly expressed NCL events, NCLcomparator also provides a series of useful measurements such as the expression levels of the detected NCL events and their corresponding host genes and the junction usage of the co-linear splice junctions at both NCL donor and acceptor sites. CONCLUSION: NCLcomparator provides useful guidelines, with the input of identified NCL events from various detectors and the corresponding paired-end RNA-seq data only, to help users selecting potentially high-confidence NCL events for further functional investigation. The software thus helps to facilitate future studies into NCL events, shedding light on the fundamental biology of this important but understudied class of transcripts. NCLcomparator is freely accessible at https://github.com/TreesLab/NCLcomparator. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12859-018-2589-0) contains supplementary material, which is available to authorized users. |
format | Online Article Text |
id | pubmed-6318855 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2019 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-63188552019-01-08 NCLcomparator: systematically post-screening non-co-linear transcripts (circular, trans-spliced, or fusion RNAs) identified from various detectors Chen, Chia-Ying Chuang, Trees-Juen BMC Bioinformatics Software BACKGROUND: Non-co-linear (NCL) transcripts consist of exonic sequences that are topologically inconsistent with the reference genome in an intragenic fashion (circular or intragenic trans-spliced RNAs) or in an intergenic fashion (fusion or intergenic trans-spliced RNAs). On the basis of RNA-seq data, numerous NCL event detectors have been developed and detected thousands of NCL events in diverse species. However, there are great discrepancies in the identification results among detectors, indicating a considerable proportion of false positives in the detected NCL events. Although several helpful guidelines for evaluating the performance of NCL event detectors have been provided, a systematic guideline for measurement of NCL events identified by existing tools has not been available. RESULTS: We develop a software, NCLcomparator, for systematically post-screening the intragenic or intergenic NCL events identified by various NCL detectors. NCLcomparator first examine whether the input NCL events are potentially false positives derived from ambiguous alignments (i.e., the NCL events have an alternative co-linear explanation or multiple matches against the reference genome). To evaluate the reliability of the identified NCL events, we define the NCL score (NCL(score)) based on the variation in the number of supporting NCL junction reads identified by the tools examined. Of the input NCL events, we show that the ambiguous alignment-derived events have relatively lower NCL(score) values than the other events, indicating that an NCL event with a higher NCL(score) has a higher level of reliability. To help selecting highly expressed NCL events, NCLcomparator also provides a series of useful measurements such as the expression levels of the detected NCL events and their corresponding host genes and the junction usage of the co-linear splice junctions at both NCL donor and acceptor sites. CONCLUSION: NCLcomparator provides useful guidelines, with the input of identified NCL events from various detectors and the corresponding paired-end RNA-seq data only, to help users selecting potentially high-confidence NCL events for further functional investigation. The software thus helps to facilitate future studies into NCL events, shedding light on the fundamental biology of this important but understudied class of transcripts. NCLcomparator is freely accessible at https://github.com/TreesLab/NCLcomparator. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12859-018-2589-0) contains supplementary material, which is available to authorized users. BioMed Central 2019-01-03 /pmc/articles/PMC6318855/ /pubmed/30606103 http://dx.doi.org/10.1186/s12859-018-2589-0 Text en © The Author(s). 2019 Open AccessThis article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Software Chen, Chia-Ying Chuang, Trees-Juen NCLcomparator: systematically post-screening non-co-linear transcripts (circular, trans-spliced, or fusion RNAs) identified from various detectors |
title | NCLcomparator: systematically post-screening non-co-linear transcripts (circular, trans-spliced, or fusion RNAs) identified from various detectors |
title_full | NCLcomparator: systematically post-screening non-co-linear transcripts (circular, trans-spliced, or fusion RNAs) identified from various detectors |
title_fullStr | NCLcomparator: systematically post-screening non-co-linear transcripts (circular, trans-spliced, or fusion RNAs) identified from various detectors |
title_full_unstemmed | NCLcomparator: systematically post-screening non-co-linear transcripts (circular, trans-spliced, or fusion RNAs) identified from various detectors |
title_short | NCLcomparator: systematically post-screening non-co-linear transcripts (circular, trans-spliced, or fusion RNAs) identified from various detectors |
title_sort | nclcomparator: systematically post-screening non-co-linear transcripts (circular, trans-spliced, or fusion rnas) identified from various detectors |
topic | Software |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6318855/ https://www.ncbi.nlm.nih.gov/pubmed/30606103 http://dx.doi.org/10.1186/s12859-018-2589-0 |
work_keys_str_mv | AT chenchiaying nclcomparatorsystematicallypostscreeningnoncolineartranscriptscirculartranssplicedorfusionrnasidentifiedfromvariousdetectors AT chuangtreesjuen nclcomparatorsystematicallypostscreeningnoncolineartranscriptscirculartranssplicedorfusionrnasidentifiedfromvariousdetectors |