First step toward gene expression data integration: transcriptomic data acquisition with COMMAND>_

Bibliographic Details
Main authors: Moretto, Marco; Sonego, Paolo; Villaseñor-Altamirano, Ana B.; Engelen, Kristof
Format: Online Article Text
Language: English
Published: BioMed Central 2019
Subjects: Software
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6348648/
https://www.ncbi.nlm.nih.gov/pubmed/30691411
http://dx.doi.org/10.1186/s12859-019-2643-6
author Moretto, Marco
Sonego, Paolo
Villaseñor-Altamirano, Ana B.
Engelen, Kristof
collection PubMed
description BACKGROUND: Exploring cellular responses to stimuli using extensive gene expression profiles has become a routine procedure. Raw and processed data from these studies are available in public databases, but the opportunity to fully exploit such rich datasets is limited by the large heterogeneity of data formats. In recent years, several approaches have been proposed to integrate gene expression data for analysis and exploration at a broader level. Despite their different goals and approaches, the first step is common to every proposed method: data acquisition. Although extracting valuable information from a set of downloaded files seems straightforward, things can rapidly get complicated, especially as the number of experiments grows. Transcriptomic datasets are deposited in public databases with little regard to data format, so retrieving raw data can become a challenging task. For RNA-seq experiments this problem is partially mitigated by the fact that raw reads are generally available in databases such as the NCBI SRA. For microarray experiments, standards are not equally well established or enforced during submission, and a multitude of data formats has emerged.
RESULTS: COMMAND>_ is a specialized tool meant to simplify gene expression data acquisition. It is a flexible multi-user web application that allows users to search and download gene expression experiments, extract only the relevant information from experiment files, re-annotate microarray platforms, and present data in a simple and coherent data model for subsequent analysis.
CONCLUSIONS: COMMAND>_ facilitates the creation of local datasets of gene expression data from both microarray and RNA-seq experiments and may be a more efficient tool for building integrated gene expression compendia. COMMAND>_ is free and open-source software, with publicly available tutorials and documentation.
format Online
Article
Text
id pubmed-6348648
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-6348648 2019-01-31 BMC Bioinformatics Software BioMed Central 2019-01-28 Text en
© The Author(s). 2019 Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title First step toward gene expression data integration: transcriptomic data acquisition with COMMAND>_
topic Software