Digital expression explorer 2: a repository of uniformly processed RNA sequencing data

Bibliographic Details
Main Authors: Ziemann, Mark, Kaspi, Antony, El-Osta, Assam
Format: Online Article Text
Language: English
Published: Oxford University Press 2019
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6446219/
https://www.ncbi.nlm.nih.gov/pubmed/30942868
http://dx.doi.org/10.1093/gigascience/giz022
author Ziemann, Mark
Kaspi, Antony
El-Osta, Assam
collection PubMed
description BACKGROUND: RNA sequencing (RNA-seq) is an indispensable tool in the study of gene regulation. While the technology has brought with it better transcript coverage and quantification, there remain considerable barriers to entry for the computational biologist to analyse large data sets. There is a real need for a repository of uniformly processed RNA-seq data that is easy to use. FINDINGS: To address these obstacles, we developed Digital Expression Explorer 2 (DEE2), a web-based repository of RNA-seq data in the form of gene-level and transcript-level expression counts. DEE2 contains >5.3 trillion assigned reads from 580,000 RNA-seq data sets including species Escherichia coli, yeast, Arabidopsis, worm, fruit fly, zebrafish, rat, mouse, and human. Base-space sequence data downloaded from the National Center for Biotechnology Information Sequence Read Archive underwent quality control prior to transcriptome and genome mapping using open-source tools. Uniform data processing methods ensure consistency across experiments, facilitating fast and reproducible meta-analyses. CONCLUSIONS: The web interface allows users to quickly identify data sets of interest using accession number and keyword searches. The data can also be accessed programmatically using a specifically designed R package. We demonstrate that DEE2 data are compatible with statistical packages such as edgeR or DESeq. Bulk data are also available for download. DEE2 can be found at http://dee2.io.
format Online
Article
Text
id pubmed-6446219
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-6446219 2019-04-09 Digital expression explorer 2: a repository of uniformly processed RNA sequencing data Ziemann, Mark Kaspi, Antony El-Osta, Assam Gigascience Data Note Oxford University Press 2019-04-03 /pmc/articles/PMC6446219/ /pubmed/30942868 http://dx.doi.org/10.1093/gigascience/giz022 Text en © The Author(s) 2019. Published by Oxford University Press.
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title Digital expression explorer 2: a repository of uniformly processed RNA sequencing data
topic Data Note
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6446219/
https://www.ncbi.nlm.nih.gov/pubmed/30942868
http://dx.doi.org/10.1093/gigascience/giz022