
TransAtlasDB: an integrated database connecting expression data, metadata and variants

High-throughput transcriptome sequencing (RNAseq) is the universally applied method for target-free transcript identification and gene expression quantification, generating huge amounts of data. The constraint of accessing such data and interpreting results can be a major impediment in postulating s...


Bibliographic Details
Main authors: Adetunji, Modupeore O; Lamont, Susan J; Schmidt, Carl J
Format: Online Article Text
Language: English
Published: Oxford University Press 2018
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5824778/
https://www.ncbi.nlm.nih.gov/pubmed/29688361
http://dx.doi.org/10.1093/database/bay014
author Adetunji, Modupeore O
Lamont, Susan J
Schmidt, Carl J
collection PubMed
description High-throughput transcriptome sequencing (RNAseq) is the universally applied method for target-free transcript identification and gene expression quantification, generating huge amounts of data. The constraint of accessing such data and interpreting results can be a major impediment to postulating suitable hypotheses; thus an innovative storage solution that addresses limitations such as hard disk storage requirements, efficiency and reproducibility is paramount. By offering a uniform data storage and retrieval mechanism, various data can be compared and easily investigated. We present a sophisticated system, TransAtlasDB, which incorporates a hybrid architecture of both relational and NoSQL databases for fast and efficient data storage, processing and querying of large datasets from transcript expression analysis with corresponding metadata, as well as gene-associated variants (such as SNPs) and their predicted gene effects. TransAtlasDB provides a data model for accurate storage of the large amount of data derived from RNAseq analysis, and also methods of interacting with the database, either via command-line data management workflows, written in Perl, with useful functionalities that simplify the storage and manipulation of the massive amounts of data generated from RNAseq analysis, or through the web interface. The database application is currently modeled to handle analysis data from agricultural species, and will be expanded to include more species groups. Overall, TransAtlasDB aims to serve as an accessible repository for the large, complex results data files derived from RNAseq gene expression profiling and variant analysis. Database URL: https://modupeore.github.io/TransAtlasDB/
format Online
Article
Text
id pubmed-5824778
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-58247782018-02-28 TransAtlasDB: an integrated database connecting expression data, metadata and variants Adetunji, Modupeore O Lamont, Susan J Schmidt, Carl J Database (Oxford) Original Article High-throughput transcriptome sequencing (RNAseq) is the universally applied method for target-free transcript identification and gene expression quantification, generating huge amounts of data. The constraint of accessing such data and interpreting results can be a major impediment to postulating suitable hypotheses; thus an innovative storage solution that addresses limitations such as hard disk storage requirements, efficiency and reproducibility is paramount. By offering a uniform data storage and retrieval mechanism, various data can be compared and easily investigated. We present a sophisticated system, TransAtlasDB, which incorporates a hybrid architecture of both relational and NoSQL databases for fast and efficient data storage, processing and querying of large datasets from transcript expression analysis with corresponding metadata, as well as gene-associated variants (such as SNPs) and their predicted gene effects. TransAtlasDB provides a data model for accurate storage of the large amount of data derived from RNAseq analysis, and also methods of interacting with the database, either via command-line data management workflows, written in Perl, with useful functionalities that simplify the storage and manipulation of the massive amounts of data generated from RNAseq analysis, or through the web interface. The database application is currently modeled to handle analysis data from agricultural species, and will be expanded to include more species groups. Overall, TransAtlasDB aims to serve as an accessible repository for the large, complex results data files derived from RNAseq gene expression profiling and variant analysis.
Database URL: https://modupeore.github.io/TransAtlasDB/ Oxford University Press 2018-02-23 /pmc/articles/PMC5824778/ /pubmed/29688361 http://dx.doi.org/10.1093/database/bay014 Text en © The Author(s) 2018. Published by Oxford University Press. http://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title TransAtlasDB: an integrated database connecting expression data, metadata and variants
topic Original Article