A Modular and Expandable Ecosystem for Metabolomics Data Annotation in R

Liquid chromatography-mass spectrometry (LC-MS)-based untargeted metabolomics experiments have become increasingly popular because of the wide range of metabolites that can be analyzed and the possibility to measure novel compounds. LC-MS instrumentation and analysis conditions can differ substantia...

Bibliographic Details
Main authors: Rainer, Johannes, Vicini, Andrea, Salzer, Liesa, Stanstrup, Jan, Badia, Josep M., Neumann, Steffen, Stravs, Michael A., Verri Hernandes, Vinicius, Gatto, Laurent, Gibb, Sebastian, Witting, Michael
Format: Online Article Text
Language: English
Published: MDPI 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8878271/
https://www.ncbi.nlm.nih.gov/pubmed/35208247
http://dx.doi.org/10.3390/metabo12020173
_version_ 1784658621716496384
author Rainer, Johannes
Vicini, Andrea
Salzer, Liesa
Stanstrup, Jan
Badia, Josep M.
Neumann, Steffen
Stravs, Michael A.
Verri Hernandes, Vinicius
Gatto, Laurent
Gibb, Sebastian
Witting, Michael
collection PubMed
description Liquid chromatography-mass spectrometry (LC-MS)-based untargeted metabolomics experiments have become increasingly popular because of the wide range of metabolites that can be analyzed and the possibility of measuring novel compounds. LC-MS instrumentation and analysis conditions can differ substantially among laboratories and experiments, resulting in non-standardized datasets that demand customized annotation workflows. We present an ecosystem of R packages, centered around the MetaboCoreUtils, MetaboAnnotation and CompoundDb packages, that together provide a modular infrastructure for the annotation of untargeted metabolomics data. Initial annotation can be performed based on MS1 properties such as m/z and retention times, followed by an MS2-based annotation in which experimental fragment spectra are compared against a reference library. Such reference databases can be created and managed with the CompoundDb package. The ecosystem supports data in a variety of formats, including, but not limited to, MSP, MGF, mzML, mzXML and netCDF, as well as MassBank text files and SQL databases. Through its highly customizable functionality, the presented infrastructure allows users to build reproducible annotation workflows tailored to most untargeted LC-MS-based datasets. All core functionality, which supports base R data types, is exported, which also facilitates its re-use in other R packages. Finally, all packages are thoroughly unit-tested and documented and are available on GitHub and through Bioconductor.
format Online
Article
Text
id pubmed-8878271
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-8878271 2022-02-26 A Modular and Expandable Ecosystem for Metabolomics Data Annotation in R. Metabolites. Article. MDPI 2022-02-11 /pmc/articles/PMC8878271/ /pubmed/35208247 http://dx.doi.org/10.3390/metabo12020173 Text en © 2022 by the authors.
https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title A Modular and Expandable Ecosystem for Metabolomics Data Annotation in R
topic Article