RSEQtools: a modular framework to analyze RNA-Seq data using compact, anonymized data summaries

Summary: The advent of next-generation sequencing for functional genomics has given rise to quantities of sequence information that are often so large that they are difficult to handle. Moreover, sequence reads from a specific individual can contain sufficient information to potentially identify and...

Bibliographic Details
Main Authors: Habegger, Lukas; Sboner, Andrea; Gianoulis, Tara A.; Rozowsky, Joel; Agarwal, Ashish; Snyder, Michael; Gerstein, Mark
Format: Text
Language: English
Published: Oxford University Press 2011
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3018817/
https://www.ncbi.nlm.nih.gov/pubmed/21134889
http://dx.doi.org/10.1093/bioinformatics/btq643
author Habegger, Lukas
Sboner, Andrea
Gianoulis, Tara A.
Rozowsky, Joel
Agarwal, Ashish
Snyder, Michael
Gerstein, Mark
collection PubMed
description Summary: The advent of next-generation sequencing for functional genomics has given rise to quantities of sequence information that are often so large that they are difficult to handle. Moreover, sequence reads from a specific individual can contain sufficient information to potentially identify and genetically characterize that person, raising privacy concerns. In order to address these issues, we have developed the Mapped Read Format (MRF), a compact data summary format for both short and long read alignments that enables the anonymization of confidential sequence information, while allowing one to still carry out many functional genomics studies. We have developed a suite of tools (RSEQtools) that use this format for the analysis of RNA-Seq experiments. These tools consist of a set of modules that perform common tasks such as calculating gene expression values, generating signal tracks of mapped reads and segmenting that signal into actively transcribed regions. Moreover, the tools can readily be used to build customizable RNA-Seq workflows. In addition to the anonymization afforded by MRF, this format also facilitates the decoupling of the alignment of reads from downstream analyses. Availability and implementation: RSEQtools is implemented in C and the source code is available at http://rseqtools.gersteinlab.org/. Contact: lukas.habegger@yale.edu; mark.gerstein@yale.edu Supplementary information: Supplementary data are available at Bioinformatics online.
format Text
id pubmed-3018817
institution National Center for Biotechnology Information
language English
publishDate 2011
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-3018817 2011-01-12 Bioinformatics Applications Note Oxford University Press 2011-01-15 2010-12-05 /pmc/articles/PMC3018817/ /pubmed/21134889 http://dx.doi.org/10.1093/bioinformatics/btq643 Text en © The Author(s) 2010. Published by Oxford University Press.
http://creativecommons.org/licenses/by-nc/2.0/uk/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.5), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
title RSEQtools: a modular framework to analyze RNA-Seq data using compact, anonymized data summaries
topic Applications Note