NCLscan: accurate identification of non-co-linear transcripts (fusion, trans-splicing and circular RNA) with a good balance between sensitivity and precision

Bibliographic Details
Main authors: Chuang, Trees-Juen; Wu, Chan-Shuo; Chen, Chia-Ying; Hung, Li-Yuan; Chiang, Tai-Wei; Yang, Min-Yu
Format: Online Article Text
Language: English
Published: Oxford University Press 2016
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4756807/
https://www.ncbi.nlm.nih.gov/pubmed/26442529
http://dx.doi.org/10.1093/nar/gkv1013
_version_ 1782416398464581632
author Chuang, Trees-Juen
Wu, Chan-Shuo
Chen, Chia-Ying
Hung, Li-Yuan
Chiang, Tai-Wei
Yang, Min-Yu
author_sort Chuang, Trees-Juen
collection PubMed
description Analysis of RNA-seq data often detects numerous ‘non-co-linear’ (NCL) transcripts, which comprise sequence segments that are topologically inconsistent with their corresponding DNA sequences in the reference genome. However, detection of NCL transcripts involves two major challenges: removal of false positives arising from alignment artifacts and discrimination between different types of NCL transcripts (trans-spliced, circular or fusion transcripts). Here, we developed a new NCL-transcript-detecting method (‘NCLscan’), which utilized a stepwise alignment strategy to almost completely eliminate false calls (>98% precision) without sacrificing true positives, enabling NCLscan to outperform 18 other publicly available tools (including fusion- and circular-RNA-detecting tools) in terms of sensitivity and precision, regardless of the generation strategy of the simulated dataset, the type of intragenic or intergenic NCL event, the read depth of coverage, the read length or the expression level of the NCL transcript. With this high accuracy, NCLscan was applied to distinguish between trans-spliced, circular and fusion transcripts on the basis of poly(A)- and nonpoly(A)-selected RNA-seq data. We showed that circular RNAs were expressed more ubiquitously, more abundantly and less cell type-specifically than trans-spliced and fusion transcripts. Our study thus describes a robust pipeline for the discovery of NCL transcripts, and sheds light on the fundamental biology of these non-canonical RNA events in the human transcriptome.
format Online
Article
Text
id pubmed-4756807
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-4756807 2016-02-18 NCLscan: accurate identification of non-co-linear transcripts (fusion, trans-splicing and circular RNA) with a good balance between sensitivity and precision Chuang, Trees-Juen Wu, Chan-Shuo Chen, Chia-Ying Hung, Li-Yuan Chiang, Tai-Wei Yang, Min-Yu Nucleic Acids Res Methods Online Analysis of RNA-seq data often detects numerous ‘non-co-linear’ (NCL) transcripts, which comprise sequence segments that are topologically inconsistent with their corresponding DNA sequences in the reference genome. However, detection of NCL transcripts involves two major challenges: removal of false positives arising from alignment artifacts and discrimination between different types of NCL transcripts (trans-spliced, circular or fusion transcripts). Here, we developed a new NCL-transcript-detecting method (‘NCLscan’), which utilized a stepwise alignment strategy to almost completely eliminate false calls (>98% precision) without sacrificing true positives, enabling NCLscan to outperform 18 other publicly available tools (including fusion- and circular-RNA-detecting tools) in terms of sensitivity and precision, regardless of the generation strategy of the simulated dataset, the type of intragenic or intergenic NCL event, the read depth of coverage, the read length or the expression level of the NCL transcript. With this high accuracy, NCLscan was applied to distinguish between trans-spliced, circular and fusion transcripts on the basis of poly(A)- and nonpoly(A)-selected RNA-seq data. We showed that circular RNAs were expressed more ubiquitously, more abundantly and less cell type-specifically than trans-spliced and fusion transcripts. Our study thus describes a robust pipeline for the discovery of NCL transcripts, and sheds light on the fundamental biology of these non-canonical RNA events in the human transcriptome. Oxford University Press 2016-02-18 2015-10-05 /pmc/articles/PMC4756807/ /pubmed/26442529 http://dx.doi.org/10.1093/nar/gkv1013 Text en © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research. http://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
title NCLscan: accurate identification of non-co-linear transcripts (fusion, trans-splicing and circular RNA) with a good balance between sensitivity and precision
topic Methods Online
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4756807/
https://www.ncbi.nlm.nih.gov/pubmed/26442529
http://dx.doi.org/10.1093/nar/gkv1013