Consolidating metabolite identifiers to enable contextual and multi-platform metabolomics data analysis

BACKGROUND: Analysis of data from high-throughput experiments depends on the availability of well-structured data that describe the assayed biomolecules. Procedures for obtaining and organizing such meta-data on genes, transcripts and proteins have been streamlined in many data analysis packages, bu...

Bibliographic Details
Main Authors:	Redestig, Henning; Kusano, Miyako; Fukushima, Atsushi; Matsuda, Fumio; Saito, Kazuki; Arita, Masanori
Format:	Text
Language:	English
Published:	BioMed Central 2010
Subjects:	Software
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2879285/ https://www.ncbi.nlm.nih.gov/pubmed/20426876 http://dx.doi.org/10.1186/1471-2105-11-214

author	Redestig, Henning; Kusano, Miyako; Fukushima, Atsushi; Matsuda, Fumio; Saito, Kazuki; Arita, Masanori
collection	PubMed
description	BACKGROUND: Analysis of data from high-throughput experiments depends on the availability of well-structured data that describe the assayed biomolecules. Procedures for obtaining and organizing such meta-data on genes, transcripts and proteins have been streamlined in many data analysis packages, but are still lacking for metabolites. Chemical identifiers are notoriously incoherent, encompassing a wide range of different referencing schemes with varying scope and coverage. Online chemical databases use multiple types of identifiers in parallel but lack a common primary key for reliable database consolidation. Connecting identifiers of analytes found in experimental data with the identifiers of their parent metabolites in public databases can therefore be very laborious. RESULTS: Here we present a strategy and a software tool for integrating metabolite identifiers from local reference libraries and public databases that do not depend on a single common primary identifier. The program constructs groups of interconnected identifiers of analytes and metabolites to obtain a local metabolite-centric SQLite database. The created database can be used to map in-house identifiers and synonyms to external resources such as the KEGG database. New identifiers can be imported and directly integrated with existing data. Queries can be performed in a flexible way, both from the command line and from the statistical programming environment R, to obtain data set tailored identifier mappings. CONCLUSIONS: Efficient cross-referencing of metabolite identifiers is a key technology for metabolomics data analysis. We provide a practical and flexible solution to this task and an open-source program, the metabolite masking tool (MetMask), available at http://metmask.sourceforge.net, that implements our ideas.
format	Text
id	pubmed-2879285
institution	National Center for Biotechnology Information
language	English
publishDate	2010
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-2879285 2010-06-02 Consolidating metabolite identifiers to enable contextual and multi-platform metabolomics data analysis Redestig, Henning; Kusano, Miyako; Fukushima, Atsushi; Matsuda, Fumio; Saito, Kazuki; Arita, Masanori BMC Bioinformatics Software BioMed Central 2010-04-29 /pmc/articles/PMC2879285/ /pubmed/20426876 http://dx.doi.org/10.1186/1471-2105-11-214 Text en Copyright ©2010 Redestig et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title	Consolidating metabolite identifiers to enable contextual and multi-platform metabolomics data analysis
topic	Software
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2879285/ https://www.ncbi.nlm.nih.gov/pubmed/20426876 http://dx.doi.org/10.1186/1471-2105-11-214
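
The abstract in this record describes building groups of interconnected identifiers and storing them in a local SQLite database that can map in-house identifiers to external resources such as KEGG. The following is a minimal sketch of that general idea, not MetMask's actual code, schema, or interface. The function names, the single `identifier` table, the union-find grouping, and the in-house IDs `LAB-0001` and `LAB-0002` are all assumptions made for illustration.

```python
import sqlite3
from collections import defaultdict


def group_identifiers(links):
    """Group (id_type, value) pairs that are connected by any chain of links.

    Uses a simple union-find structure (an assumption of this sketch).
    """
    parent = {}

    def find(node):
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving
            node = parent[node]
        return node

    for left, right in links:
        root_left, root_right = find(left), find(right)
        if root_left != root_right:
            parent[root_right] = root_left

    groups = defaultdict(set)
    for node in list(parent):
        groups[find(node)].add(node)
    return list(groups.values())


def build_database(groups, path=":memory:"):
    """Store one row per identifier, tagged with the group it belongs to."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE identifier (group_id INTEGER, id_type TEXT, value TEXT)"
    )
    rows = [
        (group_id, id_type, value)
        for group_id, group in enumerate(groups)
        for id_type, value in sorted(group)
    ]
    conn.executemany("INSERT INTO identifier VALUES (?, ?, ?)", rows)
    conn.commit()
    return conn


def map_identifier(conn, from_type, value, to_type):
    """Return all identifiers of to_type in the same group as the given one."""
    cursor = conn.execute(
        "SELECT DISTINCT b.value FROM identifier AS a "
        "JOIN identifier AS b ON a.group_id = b.group_id "
        "WHERE a.id_type = ? AND a.value = ? AND b.id_type = ? "
        "ORDER BY b.value",
        (from_type, value, to_type),
    )
    return [row[0] for row in cursor]


# Illustrative links only; the in-house IDs are hypothetical.
links = [
    (("inhouse", "LAB-0001"), ("synonym", "L-alanine")),
    (("synonym", "L-alanine"), ("kegg", "C00041")),
    (("cas", "56-41-7"), ("kegg", "C00041")),
    (("inhouse", "LAB-0002"), ("synonym", "citrate")),
    (("synonym", "citrate"), ("kegg", "C00158")),
]
groups = group_identifiers(links)
conn = build_database(groups)
print(map_identifier(conn, "inhouse", "LAB-0001", "kegg"))  # ['C00041']
```

Because groups are formed from chains of links, no single identifier type has to act as a primary key: the hypothetical `LAB-0001` reaches its KEGG entry through a shared synonym, and the CAS number reaches the in-house ID through the shared KEGG entry.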
