
A fast detection of fusion genes from paired-end RNA-seq data

BACKGROUND: Fusion genes are known to be drivers of many common cancers, so they are potential markers for diagnosis, prognosis or therapy response. The advent of paired-end RNA sequencing enhances our ability to discover fusion genes. While there are available methods, routine analyses of large num...


Bibliographic Details
Main Authors: Vu, Trung Nghia, Deng, Wenjiang, Trac, Quang Thinh, Calza, Stefano, Hwang, Woochang, Pawitan, Yudi
Format: Online Article Text
Language: English
Published: BioMed Central 2018
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6211471/
https://www.ncbi.nlm.nih.gov/pubmed/30382840
http://dx.doi.org/10.1186/s12864-018-5156-1
author Vu, Trung Nghia
Deng, Wenjiang
Trac, Quang Thinh
Calza, Stefano
Hwang, Woochang
Pawitan, Yudi
collection PubMed
description BACKGROUND: Fusion genes are known to be drivers of many common cancers, so they are potential markers for diagnosis, prognosis or therapy response. The advent of paired-end RNA sequencing enhances our ability to discover fusion genes. While methods are available, routine analyses of large numbers of samples are still limited by high computational demands. RESULTS: We develop FuSeq, a fast and accurate method to discover fusion genes. FuSeq uses quasi-mapping to quickly map the reads, extracts initial candidates from split reads and fusion equivalence classes of mapped reads, and finally applies multiple filters and statistical tests to obtain the final candidates. We apply FuSeq to four validated datasets: breast cancer, melanoma and glioma datasets, and one spike-in dataset. The results reveal high sensitivity and specificity in all datasets, and compare well against other methods such as FusionMap, TRUP, TopHat-Fusion, SOAPfuse and JAFFA. In terms of computational time, FuSeq is two-fold faster than FusionMap and orders of magnitude faster than the other methods. CONCLUSIONS: With its lower computational demands, FuSeq makes it practical to investigate fusion genes in large numbers of samples. FuSeq is implemented in C++ and R, and is available at https://github.com/nghiavtr/FuSeq for non-commercial use. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12864-018-5156-1) contains supplementary material, which is available to authorized users.
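The description mentions extracting initial candidates from fusion equivalence classes of mapped reads. The following is a minimal Python sketch of one plausible reading of that idea, not FuSeq's actual C++/R implementation: read pairs whose two mates map to transcripts of different genes are grouped by the exact transcript sets they hit, and classes with too few supporting pairs are dropped. The function name `fusion_equivalence_classes`, the `tx2gene` mapping, the input layout, and the `min_support` threshold are all hypothetical and chosen only for illustration; the statistical tests and further filters named in the abstract are not modelled.

```python
from collections import Counter
from typing import Dict, Iterable, Sequence, Tuple

# A class is keyed by the transcript sets hit by mate 1 and by mate 2.
ClassKey = Tuple[Tuple[str, ...], Tuple[str, ...]]


def fusion_equivalence_classes(
    read_pairs: Iterable[Tuple[Sequence[str], Sequence[str]]],
    tx2gene: Dict[str, str],
    min_support: int = 2,  # hypothetical threshold, not taken from the paper
) -> Dict[ClassKey, int]:
    """Count read pairs per candidate fusion class (illustrative only).

    Each read pair is given as (transcripts hit by mate 1, transcripts hit
    by mate 2). A pair is treated as fusion-supporting when the two mates
    share no gene.
    """
    counts: Counter = Counter()
    for mate1_tx, mate2_tx in read_pairs:
        genes1 = {tx2gene[t] for t in mate1_tx}
        genes2 = {tx2gene[t] for t in mate2_tx}
        # Skip unmapped mates and pairs that are consistent with one gene.
        if not genes1 or not genes2 or not genes1.isdisjoint(genes2):
            continue
        key = (tuple(sorted(set(mate1_tx))), tuple(sorted(set(mate2_tx))))
        counts[key] += 1
    # Simple support filter standing in for the real filtering stage.
    return {key: n for key, n in counts.items() if n >= min_support}
```

Keying each class by sorted transcript tuples means read pairs with identical mapping patterns collapse into one counter entry, so downstream filtering works on a small number of classes rather than on individual reads.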
format Online
Article
Text
id pubmed-6211471
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-6211471 2018-11-08 A fast detection of fusion genes from paired-end RNA-seq data Vu, Trung Nghia Deng, Wenjiang Trac, Quang Thinh Calza, Stefano Hwang, Woochang Pawitan, Yudi BMC Genomics Methodology Article
BioMed Central 2018-11-01 /pmc/articles/PMC6211471/ /pubmed/30382840 http://dx.doi.org/10.1186/s12864-018-5156-1 Text en © The Author(s) 2018 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title A fast detection of fusion genes from paired-end RNA-seq data
topic Methodology Article