A Python-Based Pipeline for Preprocessing LC–MS Data for Untargeted Metabolomics Workflows

Preprocessing data in a reproducible and robust way is one of the current challenges in untargeted metabolomics workflows. Data curation in liquid chromatography–mass spectrometry (LC–MS) involves the removal of biologically non-relevant features (retention time, m/z pairs) to retain only high-quali...

Bibliographic Details
Main authors: Riquelme, Gabriel; Zabalegui, Nicolás; Marchi, Pablo; Jones, Christina M.; Monge, María Eugenia
Format: Online Article Text
Language: English
Published: MDPI 2020
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7602939/
https://www.ncbi.nlm.nih.gov/pubmed/33081373
http://dx.doi.org/10.3390/metabo10100416
author Riquelme, Gabriel
Zabalegui, Nicolás
Marchi, Pablo
Jones, Christina M.
Monge, María Eugenia
collection PubMed
description Preprocessing data in a reproducible and robust way is one of the current challenges in untargeted metabolomics workflows. Data curation in liquid chromatography–mass spectrometry (LC–MS) involves the removal of biologically non-relevant features (retention time, m/z pairs) to retain only high-quality data for subsequent analysis and interpretation. The present work introduces TidyMS, a package for the Python programming language for preprocessing LC–MS data for quality control (QC) procedures in untargeted metabolomics workflows. It is a versatile strategy that can be customized or fit for purpose according to the specific metabolomics application. It allows performing quality control procedures to ensure accuracy and reliability in LC–MS measurements, and it allows preprocessing metabolomics data to obtain cleaned matrices for subsequent statistical analysis. The capabilities of the package are shown with pipelines for an LC–MS system suitability check, system conditioning, signal drift evaluation, and data curation. These applications were implemented to preprocess data corresponding to a new suite of candidate plasma reference materials developed by the National Institute of Standards and Technology (NIST; hypertriglyceridemic, diabetic, and African-American plasma pools) to be used in untargeted metabolomics studies in addition to NIST SRM 1950 Metabolites in Frozen Human Plasma. The package offers a rapid and reproducible workflow that can be used in an automated or semi-automated fashion, and it is an open and free tool available to all users.
format Online
Article
Text
id pubmed-7602939
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher MDPI
record_format MEDLINE/PubMed
Journal: Metabolites. Publication date: 2020-10-16.
© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
title A Python-Based Pipeline for Preprocessing LC–MS Data for Untargeted Metabolomics Workflows
topic Article