ReCount: A multi-experiment resource of analysis-ready RNA-seq gene count datasets

1 BACKGROUND: RNA sequencing is a flexible and powerful new approach for measuring gene, exon, or isoform expression. To maximize the utility of RNA sequencing data, new statistical methods are needed for clustering, differential expression, and other analyses. A major barrier to the development of...

Bibliographic Details
Main Authors: Frazee, Alyssa C; Langmead, Ben; Leek, Jeffrey T
Format: Online Article Text
Language: English
Published: BioMed Central 2011
Subjects: Database
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3229291/
https://www.ncbi.nlm.nih.gov/pubmed/22087737
http://dx.doi.org/10.1186/1471-2105-12-449
_version_ 1782217933603209216
author Frazee, Alyssa C
Langmead, Ben
Leek, Jeffrey T
author_facet Frazee, Alyssa C
Langmead, Ben
Leek, Jeffrey T
author_sort Frazee, Alyssa C
collection PubMed
description 1 BACKGROUND: RNA sequencing is a flexible and powerful new approach for measuring gene, exon, or isoform expression. To maximize the utility of RNA sequencing data, new statistical methods are needed for clustering, differential expression, and other analyses. A major barrier to the development of new statistical methods is the lack of RNA sequencing datasets that can be easily obtained and analyzed in common statistical software packages such as R. To speed up the development process, we have created a resource of analysis-ready RNA-sequencing datasets. 2 DESCRIPTION: ReCount is an online resource of RNA-seq gene count tables and auxiliary data. Tables were built from raw RNA sequencing data from 18 different published studies comprising 475 samples and over 8 billion reads. Using the Myrna package, reads were aligned, overlapped with gene models, and tabulated into gene-by-sample count tables that are ready for statistical analysis. Count tables and phenotype data were combined into Bioconductor ExpressionSet objects for ease of analysis. ReCount also contains the Myrna manifest files and R source code used to process the samples, allowing statistical and computational scientists to consider alternative parameter values. 3 CONCLUSIONS: By combining datasets from many studies and providing data that has already been processed from .fastq format into ready-to-use .RData and .txt files, ReCount facilitates analysis and methods development for RNA-seq count data. We anticipate that ReCount will also be useful for investigators who wish to consider cross-study comparisons and alternative normalization strategies for RNA-seq.
format Online
Article
Text
id pubmed-3229291
institution National Center for Biotechnology Information
language English
publishDate 2011
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3229291 2011-12-03 ReCount: A multi-experiment resource of analysis-ready RNA-seq gene count datasets Frazee, Alyssa C Langmead, Ben Leek, Jeffrey T BMC Bioinformatics Database 1 BACKGROUND: RNA sequencing is a flexible and powerful new approach for measuring gene, exon, or isoform expression. To maximize the utility of RNA sequencing data, new statistical methods are needed for clustering, differential expression, and other analyses. A major barrier to the development of new statistical methods is the lack of RNA sequencing datasets that can be easily obtained and analyzed in common statistical software packages such as R. To speed up the development process, we have created a resource of analysis-ready RNA-sequencing datasets. 2 DESCRIPTION: ReCount is an online resource of RNA-seq gene count tables and auxiliary data. Tables were built from raw RNA sequencing data from 18 different published studies comprising 475 samples and over 8 billion reads. Using the Myrna package, reads were aligned, overlapped with gene models, and tabulated into gene-by-sample count tables that are ready for statistical analysis. Count tables and phenotype data were combined into Bioconductor ExpressionSet objects for ease of analysis. ReCount also contains the Myrna manifest files and R source code used to process the samples, allowing statistical and computational scientists to consider alternative parameter values. 3 CONCLUSIONS: By combining datasets from many studies and providing data that has already been processed from .fastq format into ready-to-use .RData and .txt files, ReCount facilitates analysis and methods development for RNA-seq count data. We anticipate that ReCount will also be useful for investigators who wish to consider cross-study comparisons and alternative normalization strategies for RNA-seq.
BioMed Central 2011-11-16 /pmc/articles/PMC3229291/ /pubmed/22087737 http://dx.doi.org/10.1186/1471-2105-12-449 Text en Copyright ©2011 Frazee et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Database
Frazee, Alyssa C
Langmead, Ben
Leek, Jeffrey T
ReCount: A multi-experiment resource of analysis-ready RNA-seq gene count datasets
title ReCount: A multi-experiment resource of analysis-ready RNA-seq gene count datasets
title_full ReCount: A multi-experiment resource of analysis-ready RNA-seq gene count datasets
title_fullStr ReCount: A multi-experiment resource of analysis-ready RNA-seq gene count datasets
title_full_unstemmed ReCount: A multi-experiment resource of analysis-ready RNA-seq gene count datasets
title_short ReCount: A multi-experiment resource of analysis-ready RNA-seq gene count datasets
title_sort recount: a multi-experiment resource of analysis-ready rna-seq gene count datasets
topic Database
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3229291/
https://www.ncbi.nlm.nih.gov/pubmed/22087737
http://dx.doi.org/10.1186/1471-2105-12-449
work_keys_str_mv AT frazeealyssac recountamultiexperimentresourceofanalysisreadyrnaseqgenecountdatasets
AT langmeadben recountamultiexperimentresourceofanalysisreadyrnaseqgenecountdatasets
AT leekjeffreyt recountamultiexperimentresourceofanalysisreadyrnaseqgenecountdatasets