Open-access synthetic spike-in mRNA-seq data for cancer gene fusions

BACKGROUND: Oncogenic fusion genes underlie the mechanism of several common cancers. Next-generation sequencing based RNA-seq analyses have revealed an increasing number of recurrent fusions in a variety of cancers. However, absence of a publicly available gene-fusion focused RNA-seq data impedes co...

Bibliographic Details
Main Authors: Tembe, Waibhav D, Pond, Stephanie JK, Legendre, Christophe, Chuang, Han-Yu, Liang, Winnie S, Kim, Nancy E, Montel, Valerie, Wong, Shukmei, McDaniel, Timothy K, Craig, David W, Carpten, John D
Format: Online Article Text
Language: English
Published: BioMed Central 2014
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4190330/
https://www.ncbi.nlm.nih.gov/pubmed/25266161
http://dx.doi.org/10.1186/1471-2164-15-824
_version_ 1782338486037118976
author Tembe, Waibhav D
Pond, Stephanie JK
Legendre, Christophe
Chuang, Han-Yu
Liang, Winnie S
Kim, Nancy E
Montel, Valerie
Wong, Shukmei
McDaniel, Timothy K
Craig, David W
Carpten, John D
author_sort Tembe, Waibhav D
collection PubMed
description BACKGROUND: Oncogenic fusion genes underlie the mechanism of several common cancers. Next-generation sequencing based RNA-seq analyses have revealed an increasing number of recurrent fusions in a variety of cancers. However, the absence of publicly available, gene-fusion-focused RNA-seq data impedes comparative assessment and collaborative development of novel gene-fusion detection algorithms. We have generated nine synthetic poly-adenylated RNA transcripts that correspond to previously reported oncogenic gene fusions. These synthetic RNAs were spiked at known molarity over a wide range into total RNA prior to construction of next-generation sequencing mRNA libraries to generate RNA-seq data. RESULTS: Leveraging a priori knowledge about replicates and molarity of each synthetic fusion transcript, we demonstrate the utility of this dataset for comparing the detection ability of multiple gene fusion algorithms. In general, more fusions are detected at higher molarity, indicating that our constructs performed as expected. However, systematic detection differences are observed based on molarity or algorithm-specific characteristics. Fusion-sequence specific detection differences indicate that for applications where specific sequences are being investigated, additional constructs may be added to provide quantitative data that is specific for the sequence of interest. CONCLUSIONS: To our knowledge, this is the first publicly available synthetic RNA-seq data that specifically leverages known cancer gene-fusions. The proposed method of designing multiple gene-fusion constructs over a wide range of molarity allows granular performance analyses of multiple fusion-detection algorithms. The community can leverage and augment this publicly available data to further collaborative development of analytical tools and performance assessment frameworks for gene fusions from next-generation sequencing data.
ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/1471-2164-15-824) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-4190330
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4190330 2014-10-10 Open-access synthetic spike-in mRNA-seq data for cancer gene fusions BMC Genomics Methodology Article BioMed Central 2014-09-30 /pmc/articles/PMC4190330/ /pubmed/25266161 http://dx.doi.org/10.1186/1471-2164-15-824 Text en
© Tembe et al.; licensee BioMed Central Ltd. 2014 This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Open-access synthetic spike-in mRNA-seq data for cancer gene fusions
topic Methodology Article