
A data preprocessing strategy for metabolomics to reduce the mask effect in data analysis

Highlights: Developed a data preprocessing strategy to cope with missing values and with mask effects in data analysis that arise from the high variation of abundant metabolites. A new method, ‘x-VAST’, was developed to amend the measurement deviation enlargement. Applying the above strategy, several low abundant masked...


Bibliographic Details
Main Authors:	Yang, Jun; Zhao, Xinjie; Lu, Xin; Lin, Xiaohui; Xu, Guowang
Format:	Online Article Text
Language:	English
Published:	Frontiers Media S.A. 2015
Subjects:	Molecular Biosciences
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4428451/ https://www.ncbi.nlm.nih.gov/pubmed/25988172 http://dx.doi.org/10.3389/fmolb.2015.00004

_version_	1782370892365430784
author	Yang, Jun Zhao, Xinjie Lu, Xin Lin, Xiaohui Xu, Guowang
collection	PubMed
description	Highlights: A data preprocessing strategy was developed to cope with missing values and with mask effects that arise in data analysis from the high variation of abundant metabolites. A new method, ‘x-VAST’, was developed to amend the measurement deviation enlargement. Applying this strategy, several masked low-abundance differential metabolites were rescued. Metabolomics is a booming research field. Its success relies heavily on the discovery of differential metabolites by comparing different data sets (for example, patients vs. controls). One challenge is that between-group differences in low-abundance metabolites are often masked by the high variation of abundant metabolites. To address this challenge, this study proposes a novel data preprocessing strategy consisting of three steps. In step 1, a ‘modified 80%’ rule is used to reduce the effect of missing values. In step 2, unit-variance and Pareto scaling are used to reduce the mask effect from the abundant metabolites. In step 3, to correct the adverse effect of scaling, stability information for each variable, deduced from intensity information and class information, is used to assign suitable weights to the variables. When the strategy was applied to an LC/MS-based metabolomics dataset from a chronic hepatitis B patient study and to two simulated datasets, the mask effect was partially eliminated and several new low-abundance differential metabolites were rescued.
format	Online Article Text
id	pubmed-4428451
institution	National Center for Biotechnology Information
language	English
publishDate	2015
publisher	Frontiers Media S.A.
record_format	MEDLINE/PubMed
spelling	pubmed-4428451 2015-05-18 A data preprocessing strategy for metabolomics to reduce the mask effect in data analysis Yang, Jun Zhao, Xinjie Lu, Xin Lin, Xiaohui Xu, Guowang Front Mol Biosci Molecular Biosciences Frontiers Media S.A. 2015-02-02 /pmc/articles/PMC4428451/ /pubmed/25988172 http://dx.doi.org/10.3389/fmolb.2015.00004 Text en Copyright © 2015 Yang, Zhao, Lu, Lin and Xu. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title	A data preprocessing strategy for metabolomics to reduce the mask effect in data analysis
topic	Molecular Biosciences
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4428451/ https://www.ncbi.nlm.nih.gov/pubmed/25988172 http://dx.doi.org/10.3389/fmolb.2015.00004
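
The first two steps named in the description (the ‘modified 80%’ rule and unit-variance/Pareto scaling) can be sketched as follows. This is a minimal illustration and not the authors' implementation. It assumes that the ‘modified 80%’ rule keeps a variable when it is detected in at least 80% of the samples of at least one class, and that missing values are encoded as NaN. The function names `modified_80_rule` and `scale` are hypothetical. Step 3 (the stability-based weighting) is not sketched because the record does not give its formula.

```python
import numpy as np


def modified_80_rule(X, labels, threshold=0.8):
    """Return a boolean mask of variables (columns) to keep.

    Assumption: a variable is kept if it is present (not NaN) in at
    least `threshold` of the samples of at least one class.
    """
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    keep = np.zeros(X.shape[1], dtype=bool)
    for cls in np.unique(labels):
        present = ~np.isnan(X[labels == cls])
        keep |= present.mean(axis=0) >= threshold
    return keep


def scale(X, method="uv"):
    """Mean-center each column, then divide by a scaling factor.

    "uv"     : unit-variance scaling, divide by the standard deviation.
    "pareto" : Pareto scaling, divide by the square root of the
               standard deviation.
    """
    X = np.asarray(X, dtype=float)
    mean = np.nanmean(X, axis=0)
    sd = np.nanstd(X, axis=0, ddof=1)
    if method == "uv":
        denom = sd
    elif method == "pareto":
        denom = np.sqrt(sd)
    else:
        raise ValueError(f"unknown scaling method: {method!r}")
    # Constant columns have sd == 0; leave them centered but unscaled.
    denom = np.where(denom == 0, 1.0, denom)
    return (X - mean) / denom


if __name__ == "__main__":
    # Toy data: 6 samples (3 per class), 3 variables, one sparse variable.
    X = np.array([
        [100.0, 1.0, np.nan],
        [150.0, 1.2, np.nan],
        [ 90.0, 0.9, 3.0],
        [300.0, 2.0, np.nan],
        [280.0, 2.1, np.nan],
        [310.0, 1.9, np.nan],
    ])
    labels = [0, 0, 0, 1, 1, 1]
    kept = X[:, modified_80_rule(X, labels)]
    print(scale(kept, "uv"))
    print(scale(kept, "pareto"))
```

Unit-variance scaling gives every variable the same variance, so abundant metabolites no longer dominate, while Pareto scaling is a milder compromise that keeps part of the original intensity structure.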
