A Python library for FAIRer access and deposition to the Metabolomics Workbench Data Repository

INTRODUCTION: The Metabolomics Workbench Data Repository is a public repository of mass spectrometry and nuclear magnetic resonance data and metadata derived from a wide variety of metabolomics studies. The data and metadata for each study are deposited, stored, and accessed via files in the domain-s...

Bibliographic Details
Main Authors: Smelter, Andrey, Moseley, Hunter N. B.
Format: Online Article Text
Language: English
Published: Springer US 2018
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5910482/
https://www.ncbi.nlm.nih.gov/pubmed/29706851
http://dx.doi.org/10.1007/s11306-018-1356-6
author Smelter, Andrey
Moseley, Hunter N. B.
collection PubMed
description INTRODUCTION: The Metabolomics Workbench Data Repository is a public repository of mass spectrometry and nuclear magnetic resonance data and metadata derived from a wide variety of metabolomics studies. The data and metadata for each study are deposited, stored, and accessed via files in the domain-specific ‘mwTab’ flat file format.
OBJECTIVES: In order to improve the accessibility, reusability, and interoperability of the data and metadata stored in ‘mwTab’ formatted files, we implemented a Python library and package. This Python package, named ‘mwtab’, is a parser for the domain-specific ‘mwTab’ flat file format, which provides facilities for reading, accessing, and writing ‘mwTab’ formatted files. Furthermore, the package provides facilities to validate both the format and required metadata elements of a given ‘mwTab’ formatted file.
METHODS: In order to develop the ‘mwtab’ package we used the official ‘mwTab’ format specification. We used Git version control along with a Python unit-testing framework and a continuous integration service to run those tests on multiple versions of Python. Package documentation was developed using the Sphinx documentation generator.
RESULTS: The ‘mwtab’ package provides both Python programmatic library interfaces and command-line interfaces for reading, writing, and validating ‘mwTab’ formatted files. Data and associated metadata are stored within Python dictionary- and list-based data structures, enabling straightforward, ‘pythonic’ access and manipulation of data and metadata. The package also provides facilities to convert ‘mwTab’ files into a JSON formatted equivalent, enabling easy reuse of the data by all modern programming languages that implement JSON parsers. The ‘mwtab’ package implements its metadata validation functionality based on a pre-defined JSON schema that can be easily specialized for specific types of metabolomics studies. The library also provides a command-line interface for interconversion between ‘mwTab’ and JSONized formats in raw text and a variety of compressed binary file formats.
CONCLUSIONS: The ‘mwtab’ package is an easy-to-use Python package that provides FAIRer utilization of the Metabolomics Workbench Data Repository. The source code is freely available on GitHub and via the Python Package Index. Documentation includes a ‘User Guide’, ‘Tutorial’, and ‘API Reference’. The GitHub repository also provides ‘mwtab’ package unit-tests via a continuous integration service.
ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1007/s11306-018-1356-6) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-5910482
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher Springer US
record_format MEDLINE/PubMed
spelling pubmed-5910482 2018-04-24 A Python library for FAIRer access and deposition to the Metabolomics Workbench Data Repository Smelter, Andrey Moseley, Hunter N. B. Metabolomics Software/Database Springer US 2018-04-20 2018 /pmc/articles/PMC5910482/ /pubmed/29706851 http://dx.doi.org/10.1007/s11306-018-1356-6 Text en © The Author(s) 2018 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
title A Python library for FAIRer access and deposition to the Metabolomics Workbench Data Repository
topic Software/Database
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5910482/
https://www.ncbi.nlm.nih.gov/pubmed/29706851
http://dx.doi.org/10.1007/s11306-018-1356-6
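
The description in this record mentions three ideas: holding data and metadata in dictionary- and list-based structures, converting them to a JSON equivalent, and checking for required metadata elements. The following is a minimal sketch of those ideas using only the Python standard library. It does not use or reproduce the ‘mwtab’ package's API; the section names, keys, identifiers, and the `REQUIRED` mapping are hypothetical placeholders chosen for illustration, not the official ‘mwTab’ specification or the package's JSON schema.

```python
import json

# Hypothetical, simplified stand-in for a parsed 'mwTab' file.
# Section names, keys, and IDs are illustrative assumptions only.
record = {
    "METABOLOMICS WORKBENCH": {"STUDY_ID": "ST000000", "ANALYSIS_ID": "AN000000"},
    "PROJECT": {"PROJECT_TITLE": "Example project"},
    "MS_METABOLITE_DATA": {
        "Units": "peak area",
        "Data": [
            {"Metabolite": "glucose", "sample_1": "1200", "sample_2": "1350"},
            {"Metabolite": "lactate", "sample_1": "800", "sample_2": "760"},
        ],
    },
}

# Hypothetical list of required metadata keys per section (an assumption,
# far simpler than a real JSON schema).
REQUIRED = {
    "METABOLOMICS WORKBENCH": ["STUDY_ID", "ANALYSIS_ID"],
    "PROJECT": ["PROJECT_TITLE", "PROJECT_SUMMARY"],
}


def missing_metadata(rec, required):
    """Return a sorted list of 'SECTION:KEY' entries absent from rec."""
    missing = []
    for section, keys in required.items():
        present = rec.get(section, {})
        for key in keys:
            if key not in present:
                missing.append(f"{section}:{key}")
    return sorted(missing)


def to_json(rec):
    """Serialize the nested dict/list structure to a JSON string."""
    return json.dumps(rec, indent=2, sort_keys=True)


# Plain dictionary and list access to data and metadata.
second_metabolite = record["MS_METABOLITE_DATA"]["Data"][1]["Metabolite"]

# JSON round trip: any language with a JSON parser could read json_text.
json_text = to_json(record)
round_tripped = json.loads(json_text)

# Required-metadata check against the hypothetical REQUIRED mapping.
problems = missing_metadata(record, REQUIRED)
print(problems)
```

Because the structure contains only dictionaries, lists, and strings, the JSON round trip is lossless here, which is the property that makes a JSON equivalent convenient for reuse from other languages.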