Open-access synthetic spike-in mRNA-seq data for cancer gene fusions
BACKGROUND: Oncogenic fusion genes underlie the mechanism of several common cancers. Next-generation sequencing-based RNA-seq analyses have revealed an increasing number of recurrent fusions in a variety of cancers. However, the absence of a publicly available, gene-fusion-focused RNA-seq dataset impedes co...
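Main authors: | Tembe, Waibhav D; Pond, Stephanie JK; Legendre, Christophe; Chuang, Han-Yu; Liang, Winnie S; Kim, Nancy E; Montel, Valerie; Wong, Shukmei; McDaniel, Timothy K; Craig, David W; Carpten, John D |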
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2014 |
Subjects: | Methodology Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4190330/ https://www.ncbi.nlm.nih.gov/pubmed/25266161 http://dx.doi.org/10.1186/1471-2164-15-824 |
_version_ | 1782338486037118976 |
---|---|
author | Tembe, Waibhav D; Pond, Stephanie JK; Legendre, Christophe; Chuang, Han-Yu; Liang, Winnie S; Kim, Nancy E; Montel, Valerie; Wong, Shukmei; McDaniel, Timothy K; Craig, David W; Carpten, John D |
author_sort | Tembe, Waibhav D |
collection | PubMed |
description | BACKGROUND: Oncogenic fusion genes underlie the mechanism of several common cancers. Next-generation sequencing-based RNA-seq analyses have revealed an increasing number of recurrent fusions in a variety of cancers. However, the absence of a publicly available, gene-fusion-focused RNA-seq dataset impedes comparative assessment and collaborative development of novel gene-fusion detection algorithms. We have generated nine synthetic poly-adenylated RNA transcripts that correspond to previously reported oncogenic gene fusions. These synthetic RNAs were spiked at known molarity over a wide range into total RNA prior to construction of next-generation sequencing mRNA libraries to generate RNA-seq data. RESULTS: Leveraging a priori knowledge about replicates and molarity of each synthetic fusion transcript, we demonstrate the utility of this dataset for comparing the detection ability of multiple gene-fusion algorithms. In general, more fusions are detected at higher molarity, indicating that our constructs performed as expected. However, systematic detection differences are observed based on molarity or algorithm-specific characteristics. Fusion-sequence-specific detection differences indicate that, for applications where specific sequences are being investigated, additional constructs may be added to provide quantitative data specific to the sequence of interest. CONCLUSIONS: To our knowledge, this is the first publicly available synthetic RNA-seq dataset that specifically leverages known cancer gene fusions. The proposed method of designing multiple gene-fusion constructs over a wide range of molarity allows granular performance analyses of multiple fusion-detection algorithms. The community can leverage and augment this publicly available data to further collaborative development of analytical tools and performance assessment frameworks for gene fusions from next-generation sequencing data.
ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/1471-2164-15-824) contains supplementary material, which is available to authorized users. |
format | Online Article Text |
id | pubmed-4190330 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2014 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-4190330 2014-10-10 Open-access synthetic spike-in mRNA-seq data for cancer gene fusions BMC Genomics Methodology Article BioMed Central 2014-09-30 /pmc/articles/PMC4190330/ /pubmed/25266161 http://dx.doi.org/10.1186/1471-2164-15-824 Text en © Tembe et al.; licensee BioMed Central Ltd. 2014 This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
title | Open-access synthetic spike-in mRNA-seq data for cancer gene fusions |
topic | Methodology Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4190330/ https://www.ncbi.nlm.nih.gov/pubmed/25266161 http://dx.doi.org/10.1186/1471-2164-15-824 |