Cargando…

Galaxy-M: a Galaxy workflow for processing and analyzing direct infusion and liquid chromatography mass spectrometry-based metabolomics data

BACKGROUND: Metabolomics is increasingly recognized as an invaluable tool in the biological, medical and environmental sciences yet lags behind the methodological maturity of other omics fields. To achieve its full potential, including the integration of multiple omics modalities, the accessibility,...

Descripción completa

Detalles Bibliográficos
Autores principales: Davidson, Robert L., Weber, Ralf J. M., Liu, Haoyu, Sharma-Oates, Archana, Viant, Mark R.
Formato: Online Artículo Texto
Lenguaje:English
Publicado: BioMed Central 2016
Materias:
Acceso en línea:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4765054/
https://www.ncbi.nlm.nih.gov/pubmed/26913198
http://dx.doi.org/10.1186/s13742-016-0115-8
Descripción
Sumario:BACKGROUND: Metabolomics is increasingly recognized as an invaluable tool in the biological, medical and environmental sciences yet lags behind the methodological maturity of other omics fields. To achieve its full potential, including the integration of multiple omics modalities, the accessibility, standardization and reproducibility of computational metabolomics tools must be improved significantly. RESULTS: Here we present our end-to-end mass spectrometry metabolomics workflow in the widely used platform, Galaxy. Named Galaxy-M, our workflow has been developed for both direct infusion mass spectrometry (DIMS) and liquid chromatography mass spectrometry (LC-MS) metabolomics. The range of tools presented spans from processing of raw data, e.g. peak picking and alignment, through data cleansing, e.g. missing value imputation, to preparation for statistical analysis, e.g. normalization and scaling, and principal components analysis (PCA) with associated statistical evaluation. We demonstrate the ease of using these Galaxy workflows via the analysis of DIMS and LC-MS datasets, and provide PCA scores and associated statistics to help other users to ensure that they can accurately repeat the processing and analysis of these two datasets. Galaxy and data are all provided pre-installed in a virtual machine (VM) that can be downloaded from the GigaDB repository. Additionally, source code, executables and installation instructions are available from GitHub. CONCLUSIONS: The Galaxy platform has enabled us to produce an easily accessible and reproducible computational metabolomics workflow. More tools could be added by the community to expand its functionality. We recommend that Galaxy-M workflow files are included within the supplementary information of publications, enabling metabolomics studies to achieve greater reproducibility. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s13742-016-0115-8) contains supplementary material, which is available to authorized users.