
Snaptron: querying splicing patterns across tens of thousands of RNA-seq samples

MOTIVATION: As more and larger genomics studies appear, there is a growing need for comprehensive and queryable cross-study summaries. These enable researchers to leverage vast datasets that would otherwise be difficult to obtain. RESULTS: Snaptron is a search engine for summarized RNA sequencing da...


Bibliographic Details
Main authors: Wilks, Christopher; Gaddipati, Phani; Nellore, Abhinav; Langmead, Ben
Format: Online Article Text
Language: English
Published: Oxford University Press 2018
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5870547/
https://www.ncbi.nlm.nih.gov/pubmed/28968689
http://dx.doi.org/10.1093/bioinformatics/btx547
author Wilks, Christopher
Gaddipati, Phani
Nellore, Abhinav
Langmead, Ben
collection PubMed
description MOTIVATION: As more and larger genomics studies appear, there is a growing need for comprehensive and queryable cross-study summaries. These enable researchers to leverage vast datasets that would otherwise be difficult to obtain. RESULTS: Snaptron is a search engine for summarized RNA sequencing data with a query planner that leverages R-tree, B-tree and inverted indexing strategies to rapidly execute queries over 146 million exon-exon splice junctions from over 70 000 human RNA-seq samples. Queries can be tailored by constraining which junctions and samples to consider. Snaptron can score junctions according to tissue specificity or other criteria, and can score samples according to the relative frequency of different splicing patterns. We describe the software and outline biological questions that can be explored with Snaptron queries. AVAILABILITY AND IMPLEMENTATION: Documentation is at http://snaptron.cs.jhu.edu. Source code is at https://github.com/ChristopherWilks/snaptron and https://github.com/ChristopherWilks/snaptron-experiments with a CC BY-NC 4.0 license. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
format Online
Article
Text
id pubmed-5870547
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-5870547 2018-04-05 Snaptron: querying splicing patterns across tens of thousands of RNA-seq samples Wilks, Christopher Gaddipati, Phani Nellore, Abhinav Langmead, Ben Bioinformatics Applications Notes MOTIVATION: As more and larger genomics studies appear, there is a growing need for comprehensive and queryable cross-study summaries. These enable researchers to leverage vast datasets that would otherwise be difficult to obtain. RESULTS: Snaptron is a search engine for summarized RNA sequencing data with a query planner that leverages R-tree, B-tree and inverted indexing strategies to rapidly execute queries over 146 million exon-exon splice junctions from over 70 000 human RNA-seq samples. Queries can be tailored by constraining which junctions and samples to consider. Snaptron can score junctions according to tissue specificity or other criteria, and can score samples according to the relative frequency of different splicing patterns. We describe the software and outline biological questions that can be explored with Snaptron queries. AVAILABILITY AND IMPLEMENTATION: Documentation is at http://snaptron.cs.jhu.edu. Source code is at https://github.com/ChristopherWilks/snaptron and https://github.com/ChristopherWilks/snaptron-experiments with a CC BY-NC 4.0 license. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online. Oxford University Press 2018-01-01 2017-09-01 /pmc/articles/PMC5870547/ /pubmed/28968689 http://dx.doi.org/10.1093/bioinformatics/btx547 Text en © The Author 2017. Published by Oxford University Press. http://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title Snaptron: querying splicing patterns across tens of thousands of RNA-seq samples
topic Applications Notes
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5870547/
https://www.ncbi.nlm.nih.gov/pubmed/28968689
http://dx.doi.org/10.1093/bioinformatics/btx547