Digital expression explorer 2: a repository of uniformly processed RNA sequencing data
BACKGROUND: RNA sequencing (RNA-seq) is an indispensable tool in the study of gene regulation. While the technology has brought with it better transcript coverage and quantification, there remain considerable barriers to entry for the computational biologist to analyse large data sets. There is a re...
Main authors: | Ziemann, Mark; Kaspi, Antony; El-Osta, Assam |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2019 |
Subjects: | Data Note |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6446219/ https://www.ncbi.nlm.nih.gov/pubmed/30942868 http://dx.doi.org/10.1093/gigascience/giz022 |
_version_ | 1783408321008500736 |
---|---|
author | Ziemann, Mark; Kaspi, Antony; El-Osta, Assam |
author_facet | Ziemann, Mark; Kaspi, Antony; El-Osta, Assam |
author_sort | Ziemann, Mark |
collection | PubMed |
description | BACKGROUND: RNA sequencing (RNA-seq) is an indispensable tool in the study of gene regulation. While the technology has brought with it better transcript coverage and quantification, there remain considerable barriers to entry for the computational biologist to analyse large data sets. There is a real need for a repository of uniformly processed RNA-seq data that is easy to use. FINDINGS: To address these obstacles, we developed Digital Expression Explorer 2 (DEE2), a web-based repository of RNA-seq data in the form of gene-level and transcript-level expression counts. DEE2 contains >5.3 trillion assigned reads from 580,000 RNA-seq data sets including species Escherichia coli, yeast, Arabidopsis, worm, fruit fly, zebrafish, rat, mouse, and human. Base-space sequence data downloaded from the National Center for Biotechnology Information Sequence Read Archive underwent quality control prior to transcriptome and genome mapping using open-source tools. Uniform data processing methods ensure consistency across experiments, facilitating fast and reproducible meta-analyses. CONCLUSIONS: The web interface allows users to quickly identify data sets of interest using accession number and keyword searches. The data can also be accessed programmatically using a specifically designed R package. We demonstrate that DEE2 data are compatible with statistical packages such as edgeR or DESeq. Bulk data are also available for download. DEE2 can be found at http://dee2.io. |
format | Online Article Text |
id | pubmed-6446219 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2019 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-6446219 2019-04-09 Digital expression explorer 2: a repository of uniformly processed RNA sequencing data Ziemann, Mark Kaspi, Antony El-Osta, Assam Gigascience Data Note BACKGROUND: RNA sequencing (RNA-seq) is an indispensable tool in the study of gene regulation. While the technology has brought with it better transcript coverage and quantification, there remain considerable barriers to entry for the computational biologist to analyse large data sets. There is a real need for a repository of uniformly processed RNA-seq data that is easy to use. FINDINGS: To address these obstacles, we developed Digital Expression Explorer 2 (DEE2), a web-based repository of RNA-seq data in the form of gene-level and transcript-level expression counts. DEE2 contains >5.3 trillion assigned reads from 580,000 RNA-seq data sets including species Escherichia coli, yeast, Arabidopsis, worm, fruit fly, zebrafish, rat, mouse, and human. Base-space sequence data downloaded from the National Center for Biotechnology Information Sequence Read Archive underwent quality control prior to transcriptome and genome mapping using open-source tools. Uniform data processing methods ensure consistency across experiments, facilitating fast and reproducible meta-analyses. CONCLUSIONS: The web interface allows users to quickly identify data sets of interest using accession number and keyword searches. The data can also be accessed programmatically using a specifically designed R package. We demonstrate that DEE2 data are compatible with statistical packages such as edgeR or DESeq. Bulk data are also available for download. DEE2 can be found at http://dee2.io. Oxford University Press 2019-04-03 /pmc/articles/PMC6446219/ /pubmed/30942868 http://dx.doi.org/10.1093/gigascience/giz022 Text en © The Author(s) 2019. Published by Oxford University Press. http://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Data Note Ziemann, Mark Kaspi, Antony El-Osta, Assam Digital expression explorer 2: a repository of uniformly processed RNA sequencing data |
title | Digital expression explorer 2: a repository of uniformly processed RNA sequencing data |
topic | Data Note |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6446219/ https://www.ncbi.nlm.nih.gov/pubmed/30942868 http://dx.doi.org/10.1093/gigascience/giz022 |