
Large-scale untargeted LC-MS metabolomics data correction using between-batch feature alignment and cluster-based within-batch signal intensity drift correction

INTRODUCTION: Liquid chromatography-mass spectrometry (LC-MS) is a commonly used technique in untargeted metabolomics owing to broad coverage of metabolites, high sensitivity and simple sample preparation. However, data generated from multiple batches are affected by measurement errors inherent to a...


Bibliographic Details
Main Authors:	Brunius, Carl; Shi, Lin; Landberg, Rikard
Format:	Online Article Text
Language:	English
Published:	Springer US 2016
Subjects:	Original Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5031781/ https://www.ncbi.nlm.nih.gov/pubmed/27746707 http://dx.doi.org/10.1007/s11306-016-1124-4

author	Brunius, Carl; Shi, Lin; Landberg, Rikard
collection	PubMed
description	INTRODUCTION: Liquid chromatography-mass spectrometry (LC-MS) is a commonly used technique in untargeted metabolomics owing to broad coverage of metabolites, high sensitivity and simple sample preparation. However, data generated from multiple batches are affected by measurement errors inherent to alterations in signal intensity, drift in mass accuracy and retention times between samples both within and between batches. These measurement errors reduce repeatability and reproducibility and may thus decrease the power to detect biological responses and obscure interpretation. OBJECTIVE: Our aim was to develop procedures to address and correct for within- and between-batch variability in processing multiple-batch untargeted LC-MS metabolomics data to increase their quality. METHODS: Algorithms were developed for: (i) alignment and merging of features that are systematically misaligned between batches, through aggregating feature presence/missingness on batch level and combining similar features orthogonally present between batches; and (ii) within-batch drift correction using a cluster-based approach that allows multiple drift patterns within batch. Furthermore, a heuristic criterion was developed for the feature-wise choice of reference-based or population-based between-batch normalisation. RESULTS: In authentic data, between-batch alignment resulted in picking 15 % more features and deconvoluting 15 % of features previously erroneously aligned. Within-batch correction provided a decrease in median quality control feature coefficient of variation from 20.5 to 15.1 %. Algorithms are open source and available as an R package (‘batchCorr’). CONCLUSIONS: The developed procedures provide unbiased measures of improved data quality, with implications for improved data analysis. Although developed for LC-MS based metabolomics, these methods are generic and can be applied to other data suffering from similar limitations. 
ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1007/s11306-016-1124-4) contains supplementary material, which is available to authorized users.
format	Online Article Text
id	pubmed-5031781
institution	National Center for Biotechnology Information
language	English
publishDate	2016
publisher	Springer US
record_format	MEDLINE/PubMed
spelling	pubmed-5031781 2016-10-14 Metabolomics Original Article Springer US 2016-09-22 2016 /pmc/articles/PMC5031781/ /pubmed/27746707 http://dx.doi.org/10.1007/s11306-016-1124-4 Text en © The Author(s) 2016 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
title	Large-scale untargeted LC-MS metabolomics data correction using between-batch feature alignment and cluster-based within-batch signal intensity drift correction
topic	Original Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5031781/ https://www.ncbi.nlm.nih.gov/pubmed/27746707 http://dx.doi.org/10.1007/s11306-016-1124-4
