GSimp: A Gibbs sampler based left-censored missing value imputation approach for metabolomics studies

Left-censored missing values are common in targeted metabolomics datasets and can be considered missing not at random (MNAR). Improper handling of missing values adversely affects subsequent statistical analyses. However, few imputation methods have been develop...

Bibliographic Details
Main Authors: Wei, Runmin, Wang, Jingye, Jia, Erik, Chen, Tianlu, Ni, Yan, Jia, Wei
Format: Online Article Text
Language: English
Published: Public Library of Science 2018
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5809088/
https://www.ncbi.nlm.nih.gov/pubmed/29385130
http://dx.doi.org/10.1371/journal.pcbi.1005973
author Wei, Runmin
Wang, Jingye
Jia, Erik
Chen, Tianlu
Ni, Yan
Jia, Wei
collection PubMed
description Left-censored missing values are common in targeted metabolomics datasets and can be considered missing not at random (MNAR). Improper handling of missing values adversely affects subsequent statistical analyses. However, few imputation methods have been developed for and applied to MNAR data in metabolomics, so a practical left-censored missing value imputation method is urgently needed. We developed GSimp, an iterative, Gibbs sampler-based approach for imputing left-censored missing values. We compared GSimp with three other imputation methods on two real-world targeted metabolomics datasets and one simulated dataset using our imputation evaluation pipeline. The results show that GSimp outperforms the other imputation methods in terms of imputation accuracy, observation distribution, univariate and multivariate analyses, and statistical sensitivity. Additionally, a parallel version of GSimp was developed for large-scale metabolomics datasets. The R code for GSimp, the evaluation pipeline, a tutorial, and the real-world and simulated targeted metabolomics datasets are available at https://github.com/WandeRum/GSimp.
format Online
Article
Text
id pubmed-5809088
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-5809088 2018-02-28 PLoS Comput Biol Research Article Public Library of Science 2018-01-31 /pmc/articles/PMC5809088/ /pubmed/29385130 http://dx.doi.org/10.1371/journal.pcbi.1005973 Text en © 2018 Wei et al. http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title GSimp: A Gibbs sampler based left-censored missing value imputation approach for metabolomics studies
topic Research Article