
SeqOthello: querying RNA-seq experiments at scale


Bibliographic Details
Main authors: Yu, Ye; Liu, Jinpeng; Liu, Xinan; Zhang, Yi; Magner, Eamonn; Lehnert, Erik; Qian, Chen; Liu, Jinze
Format: Online Article Text
Language: English
Published: BioMed Central 2018
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6194578/
https://www.ncbi.nlm.nih.gov/pubmed/30340508
http://dx.doi.org/10.1186/s13059-018-1535-9
Description
Summary: We present SeqOthello, an ultra-fast and memory-efficient indexing structure to support arbitrary sequence query against large collections of RNA-seq experiments. It takes SeqOthello only 5 min and 19.1 GB memory to conduct a global survey of 11,658 fusion events against 10,113 TCGA Pan-Cancer RNA-seq datasets. The query recovers 92.7% of tier-1 fusions curated by the TCGA Fusion Gene Database and reveals 270 novel occurrences, all of which are present as tumor-specific. By providing a reference-free, alignment-free, and parameter-free sequence search system, SeqOthello will enable large-scale integrative studies using sequence-level data, an undertaking not previously practicable for many individual labs. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s13059-018-1535-9) contains supplementary material, which is available to authorized users.