Snaptron: querying splicing patterns across tens of thousands of RNA-seq samples
MOTIVATION: As more and larger genomics studies appear, there is a growing need for comprehensive and queryable cross-study summaries. These enable researchers to leverage vast datasets that would otherwise be difficult to obtain. RESULTS: Snaptron is a search engine for summarized RNA sequencing da...
Main authors: | Wilks, Christopher; Gaddipati, Phani; Nellore, Abhinav; Langmead, Ben |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press, 2018 |
Subjects: | Applications Notes |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5870547/ https://www.ncbi.nlm.nih.gov/pubmed/28968689 http://dx.doi.org/10.1093/bioinformatics/btx547 |
_version_ | 1783309506879422464 |
---|---|
author | Wilks, Christopher Gaddipati, Phani Nellore, Abhinav Langmead, Ben |
author_facet | Wilks, Christopher Gaddipati, Phani Nellore, Abhinav Langmead, Ben |
author_sort | Wilks, Christopher |
collection | PubMed |
description | MOTIVATION: As more and larger genomics studies appear, there is a growing need for comprehensive and queryable cross-study summaries. These enable researchers to leverage vast datasets that would otherwise be difficult to obtain. RESULTS: Snaptron is a search engine for summarized RNA sequencing data with a query planner that leverages R-tree, B-tree and inverted indexing strategies to rapidly execute queries over 146 million exon-exon splice junctions from over 70 000 human RNA-seq samples. Queries can be tailored by constraining which junctions and samples to consider. Snaptron can score junctions according to tissue specificity or other criteria, and can score samples according to the relative frequency of different splicing patterns. We describe the software and outline biological questions that can be explored with Snaptron queries. AVAILABILITY AND IMPLEMENTATION: Documentation is at http://snaptron.cs.jhu.edu. Source code is at https://github.com/ChristopherWilks/snaptron and https://github.com/ChristopherWilks/snaptron-experiments with a CC BY-NC 4.0 license. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online. |
format | Online Article Text |
id | pubmed-5870547 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2018 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-58705472018-04-05 Snaptron: querying splicing patterns across tens of thousands of RNA-seq samples Wilks, Christopher Gaddipati, Phani Nellore, Abhinav Langmead, Ben Bioinformatics Applications Notes MOTIVATION: As more and larger genomics studies appear, there is a growing need for comprehensive and queryable cross-study summaries. These enable researchers to leverage vast datasets that would otherwise be difficult to obtain. RESULTS: Snaptron is a search engine for summarized RNA sequencing data with a query planner that leverages R-tree, B-tree and inverted indexing strategies to rapidly execute queries over 146 million exon-exon splice junctions from over 70 000 human RNA-seq samples. Queries can be tailored by constraining which junctions and samples to consider. Snaptron can score junctions according to tissue specificity or other criteria, and can score samples according to the relative frequency of different splicing patterns. We describe the software and outline biological questions that can be explored with Snaptron queries. AVAILABILITY AND IMPLEMENTATION: Documentation is at http://snaptron.cs.jhu.edu. Source code is at https://github.com/ChristopherWilks/snaptron and https://github.com/ChristopherWilks/snaptron-experiments with a CC BY-NC 4.0 license. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online. Oxford University Press 2018-01-01 2017-09-01 /pmc/articles/PMC5870547/ /pubmed/28968689 http://dx.doi.org/10.1093/bioinformatics/btx547 Text en © The Author 2017. Published by Oxford University Press. http://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Applications Notes Wilks, Christopher Gaddipati, Phani Nellore, Abhinav Langmead, Ben Snaptron: querying splicing patterns across tens of thousands of RNA-seq samples |
title | Snaptron: querying splicing patterns across tens of thousands of RNA-seq samples |
title_full | Snaptron: querying splicing patterns across tens of thousands of RNA-seq samples |
title_fullStr | Snaptron: querying splicing patterns across tens of thousands of RNA-seq samples |
title_full_unstemmed | Snaptron: querying splicing patterns across tens of thousands of RNA-seq samples |
title_short | Snaptron: querying splicing patterns across tens of thousands of RNA-seq samples |
title_sort | snaptron: querying splicing patterns across tens of thousands of rna-seq samples |
topic | Applications Notes |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5870547/ https://www.ncbi.nlm.nih.gov/pubmed/28968689 http://dx.doi.org/10.1093/bioinformatics/btx547 |
work_keys_str_mv | AT wilkschristopher snaptronqueryingsplicingpatternsacrosstensofthousandsofrnaseqsamples AT gaddipatiphani snaptronqueryingsplicingpatternsacrosstensofthousandsofrnaseqsamples AT nelloreabhinav snaptronqueryingsplicingpatternsacrosstensofthousandsofrnaseqsamples AT langmeadben snaptronqueryingsplicingpatternsacrosstensofthousandsofrnaseqsamples |