FusionScan: accurate prediction of fusion genes from RNA-Seq data

Identifying fusion genes is of prominent importance in cancer research because of their potential as carcinogenic drivers. RNA sequencing (RNA-Seq) data have been the most useful source for identifying fusion transcripts. Although a number of algorithms have been developed thus far,...

Bibliographic Details
Main authors: Kim, Pora, Jang, Ye Eun, Lee, Sanghyuk
Format: Online Article Text
Language: English
Published: Korea Genome Organization 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6808644/
https://www.ncbi.nlm.nih.gov/pubmed/31610622
http://dx.doi.org/10.5808/GI.2019.17.3.e26
author Kim, Pora
Jang, Ye Eun
Lee, Sanghyuk
collection PubMed
description Identifying fusion genes is of prominent importance in cancer research because of their potential as carcinogenic drivers. RNA sequencing (RNA-Seq) data have been the most useful source for identifying fusion transcripts. Although a number of algorithms have been developed thus far, most programs produce too many false positives, making experimental confirmation almost impossible. We still lack a reliable program that achieves high precision with a reasonable recall rate. Here, we present FusionScan, a highly optimized tool for predicting fusion transcripts from RNA-Seq data. We specifically search for split reads composed of intact exons at the fusion boundaries. Using 269 known fusion cases as the reference, we have implemented various mapping and filtering strategies to remove false positives without discarding genuine fusions. In a performance test using three cell line datasets with validated fusion cases (NCI-H660, K562, and MCF-7), FusionScan outperformed other existing programs by a considerable margin, achieving precision and recall rates of 60% and 79%, respectively. A simulation test also demonstrated that FusionScan recovered most of the true positives without producing an overwhelming number of false positives, regardless of sequencing depth and read length. The computation time was comparable to that of other leading tools. We also provide several curative means to help users investigate the details of fusion candidates easily. We believe that FusionScan is a reliable, efficient, and convenient program for detecting fusion transcripts that meets the requirements of the clinical and experimental community. FusionScan is freely available at http://fusionscan.ewha.ac.kr/.
format Online
Article
Text
id pubmed-6808644
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher Korea Genome Organization
record_format MEDLINE/PubMed
spelling pubmed-6808644 2019-10-24 FusionScan: accurate prediction of fusion genes from RNA-Seq data Kim, Pora Jang, Ye Eun Lee, Sanghyuk Genomics Inform Original Article
Korea Genome Organization 2019-07-23 /pmc/articles/PMC6808644/ /pubmed/31610622 http://dx.doi.org/10.5808/GI.2019.17.3.e26 Text en (c) 2019, Korea Genome Organization (CC) This is an open-access article distributed under the terms of the Creative Commons Attribution license (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title FusionScan: accurate prediction of fusion genes from RNA-Seq data
topic Original Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6808644/
https://www.ncbi.nlm.nih.gov/pubmed/31610622
http://dx.doi.org/10.5808/GI.2019.17.3.e26
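
The abstract reports precision and recall rates of 60% and 79% for FusionScan on the three cell line datasets. As a minimal sketch of how such rates are typically computed from a set of predicted fusions and a set of validated fusions, the following Python example may help. It is not taken from FusionScan; the function names and the placeholder gene pairs are assumptions made purely for illustration, and the toy counts are not results from the article. The F1 score at the end is a standard summary derived from the two reported rates, not a figure stated in the record.

```python
def precision_recall(predicted, validated):
    """Return (precision, recall) for predicted fusions against a validated set.

    Each fusion is represented as a (5' gene, 3' gene) tuple.
    """
    predicted, validated = set(predicted), set(validated)
    true_positives = len(predicted & validated)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(validated) if validated else 0.0
    return precision, recall


def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy data: placeholder gene names, not fusions from the article.
validated = {
    ("GENE_A", "GENE_B"),
    ("GENE_C", "GENE_D"),
    ("GENE_E", "GENE_F"),
    ("GENE_G", "GENE_H"),
}
predicted = {
    ("GENE_A", "GENE_B"),
    ("GENE_C", "GENE_D"),
    ("GENE_E", "GENE_F"),
    ("GENE_X", "GENE_Y"),  # false positive
    ("GENE_P", "GENE_Q"),  # false positive
}

precision, recall = precision_recall(predicted, validated)
print(f"toy precision = {precision:.2f}, toy recall = {recall:.2f}")

# F1 implied by the rates reported in the abstract (60% and 79%).
reported_f1 = f1_score(0.60, 0.79)
print(f"F1 from reported rates = {reported_f1:.3f}")
```

In this toy case three of five predictions are validated and three of four validated fusions are recovered. Applying the same harmonic-mean formula to the reported 60% and 79% gives an F1 of about 0.68.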