
Identifying and quantifying metabolites by scoring peaks of GC-MS data

BACKGROUND: Metabolomics is one of the most recent omics technologies. It has been applied in fields such as food science, nutrition, drug discovery and systems biology. For this, gas chromatography-mass spectrometry (GC-MS) has been widely applied and many computational tools have been developed to su...


Bibliographic Details
Main authors: Aggio, Raphael BM; Mayor, Arno; Reade, Sophie; Probert, Chris SJ; Ruggiero, Katya
Format: Online Article Text
Language: English
Published: BioMed Central 2014
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4307155/
https://www.ncbi.nlm.nih.gov/pubmed/25492550
http://dx.doi.org/10.1186/s12859-014-0374-2
_version_ 1782354411383685120
author Aggio, Raphael BM
Mayor, Arno
Reade, Sophie
Probert, Chris SJ
Ruggiero, Katya
author_facet Aggio, Raphael BM
Mayor, Arno
Reade, Sophie
Probert, Chris SJ
Ruggiero, Katya
author_sort Aggio, Raphael BM
collection PubMed
description BACKGROUND: Metabolomics is one of the most recent omics technologies. It has been applied in fields such as food science, nutrition, drug discovery and systems biology. For this, gas chromatography-mass spectrometry (GC-MS) has been widely applied and many computational tools have been developed to support the analysis of metabolomics data. Among them, AMDIS is perhaps the most widely used tool for identifying and quantifying metabolites. However, AMDIS generates a high number of false positives and does not have an interface amenable to high-throughput data analysis. Although additional computational tools have been developed for processing AMDIS results and for performing normalisations and statistical analysis of metabolomics data, there is not yet a single free software tool or package able to reliably identify and quantify metabolites analysed by GC-MS. RESULTS: Here we introduce a new algorithm, PScore, able to score peaks according to their likelihood of representing metabolites defined in a mass spectral library. We implemented PScore in an R package called MetaBox and evaluated the applicability and potential of MetaBox by comparing its performance against AMDIS results when analysing volatile organic compounds (VOC) from standard mixtures of metabolites and from female and male mice faecal samples. MetaBox reported lower percentages of false positives and false negatives, and was able to report a higher number of potential biomarkers associated with the metabolism of female and male mice. CONCLUSIONS: Identification and quantification of metabolites are among the most critical and time-consuming steps in GC-MS metabolome analysis. Here we present an algorithm implemented in an R package, which allows users to construct flexible pipelines and analyse metabolomics data in a high-throughput manner.
ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12859-014-0374-2) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-4307155
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4307155 2015-02-03 Identifying and quantifying metabolites by scoring peaks of GC-MS data Aggio, Raphael BM Mayor, Arno Reade, Sophie Probert, Chris SJ Ruggiero, Katya BMC Bioinformatics Software BACKGROUND: Metabolomics is one of the most recent omics technologies. It has been applied in fields such as food science, nutrition, drug discovery and systems biology. For this, gas chromatography-mass spectrometry (GC-MS) has been widely applied and many computational tools have been developed to support the analysis of metabolomics data. Among them, AMDIS is perhaps the most widely used tool for identifying and quantifying metabolites. However, AMDIS generates a high number of false positives and does not have an interface amenable to high-throughput data analysis. Although additional computational tools have been developed for processing AMDIS results and for performing normalisations and statistical analysis of metabolomics data, there is not yet a single free software tool or package able to reliably identify and quantify metabolites analysed by GC-MS. RESULTS: Here we introduce a new algorithm, PScore, able to score peaks according to their likelihood of representing metabolites defined in a mass spectral library. We implemented PScore in an R package called MetaBox and evaluated the applicability and potential of MetaBox by comparing its performance against AMDIS results when analysing volatile organic compounds (VOC) from standard mixtures of metabolites and from female and male mice faecal samples. MetaBox reported lower percentages of false positives and false negatives, and was able to report a higher number of potential biomarkers associated with the metabolism of female and male mice. CONCLUSIONS: Identification and quantification of metabolites are among the most critical and time-consuming steps in GC-MS metabolome analysis. Here we present an algorithm implemented in an R package, which allows users to construct flexible pipelines and analyse metabolomics data in a high-throughput manner. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12859-014-0374-2) contains supplementary material, which is available to authorized users. BioMed Central 2014-12-10 /pmc/articles/PMC4307155/ /pubmed/25492550 http://dx.doi.org/10.1186/s12859-014-0374-2 Text en © Aggio et al.; licensee BioMed Central Ltd. 2014 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle Software
Aggio, Raphael BM
Mayor, Arno
Reade, Sophie
Probert, Chris SJ
Ruggiero, Katya
Identifying and quantifying metabolites by scoring peaks of GC-MS data
title Identifying and quantifying metabolites by scoring peaks of GC-MS data
title_full Identifying and quantifying metabolites by scoring peaks of GC-MS data
title_fullStr Identifying and quantifying metabolites by scoring peaks of GC-MS data
title_full_unstemmed Identifying and quantifying metabolites by scoring peaks of GC-MS data
title_short Identifying and quantifying metabolites by scoring peaks of GC-MS data
title_sort identifying and quantifying metabolites by scoring peaks of gc-ms data
topic Software
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4307155/
https://www.ncbi.nlm.nih.gov/pubmed/25492550
http://dx.doi.org/10.1186/s12859-014-0374-2
work_keys_str_mv AT aggioraphaelbm identifyingandquantifyingmetabolitesbyscoringpeaksofgcmsdata
AT mayorarno identifyingandquantifyingmetabolitesbyscoringpeaksofgcmsdata
AT readesophie identifyingandquantifyingmetabolitesbyscoringpeaksofgcmsdata
AT probertchrissj identifyingandquantifyingmetabolitesbyscoringpeaksofgcmsdata
AT ruggierokatya identifyingandquantifyingmetabolitesbyscoringpeaksofgcmsdata