
Fast Search of Thousands of Short-Read Sequencing Experiments

Bibliographic Details
Main authors: Solomon, Brad; Kingsford, Carl
Format: Online Article Text
Language: English
Published: 2016
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4804353/
https://www.ncbi.nlm.nih.gov/pubmed/26854477
http://dx.doi.org/10.1038/nbt.3442
Description
Summary: We introduce Sequence Bloom Trees, a method for querying thousands of short-read sequencing experiments by sequence, 485 times faster than existing approaches. The approach searches large data archives for all experiments that involve a given sequence. We use Sequence Bloom Trees to search 2,652 human blood, breast, and brain RNA-seq experiments for all 214,293 known transcripts in under 4 days using less than 239 MB of RAM and a single CPU.