GSimp: A Gibbs sampler based left-censored missing value imputation approach for metabolomics studies
Left-censored missing values commonly exist in targeted metabolomics datasets and can be considered as missing not at random (MNAR). Improper data processing procedures for missing values will cause adverse impacts on subsequent statistical analyses. However, few imputation methods have been develop...
Main Authors: | Wei, Runmin; Wang, Jingye; Jia, Erik; Chen, Tianlu; Ni, Yan; Jia, Wei |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science 2018 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5809088/ https://www.ncbi.nlm.nih.gov/pubmed/29385130 http://dx.doi.org/10.1371/journal.pcbi.1005973 |
_version_ | 1783299530670735360 |
---|---|
author | Wei, Runmin Wang, Jingye Jia, Erik Chen, Tianlu Ni, Yan Jia, Wei |
author_facet | Wei, Runmin Wang, Jingye Jia, Erik Chen, Tianlu Ni, Yan Jia, Wei |
author_sort | Wei, Runmin |
collection | PubMed |
description | Left-censored missing values commonly exist in targeted metabolomics datasets and can be considered as missing not at random (MNAR). Improper data processing procedures for missing values will adversely affect subsequent statistical analyses. However, few imputation methods have been developed and applied to the situation of MNAR in the field of metabolomics. Thus, a practical left-censored missing value imputation method is urgently needed. We developed an iterative Gibbs sampler-based left-censored missing value imputation approach (GSimp). We compared GSimp with three other imputation methods on two real-world targeted metabolomics datasets and one simulation dataset using our imputation evaluation pipeline. The results show that GSimp outperforms the other imputation methods in terms of imputation accuracy, observation distribution, univariate and multivariate analyses, and statistical sensitivity. Additionally, a parallel version of GSimp was developed for dealing with large-scale metabolomics datasets. The R code for GSimp, the evaluation pipeline, a tutorial, and the real-world and simulated targeted metabolomics datasets are available at: https://github.com/WandeRum/GSimp. |
format | Online Article Text |
id | pubmed-5809088 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2018 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-58090882018-02-28 GSimp: A Gibbs sampler based left-censored missing value imputation approach for metabolomics studies Wei, Runmin Wang, Jingye Jia, Erik Chen, Tianlu Ni, Yan Jia, Wei PLoS Comput Biol Research Article Left-censored missing values commonly exist in targeted metabolomics datasets and can be considered as missing not at random (MNAR). Improper data processing procedures for missing values will cause adverse impacts on subsequent statistical analyses. However, few imputation methods have been developed and applied to the situation of MNAR in the field of metabolomics. Thus, a practical left-censored missing value imputation method is urgently needed. We developed an iterative Gibbs sampler based left-censored missing value imputation approach (GSimp). We compared GSimp with other three imputation methods on two real-world targeted metabolomics datasets and one simulation dataset using our imputation evaluation pipeline. The results show that GSimp outperforms other imputation methods in terms of imputation accuracy, observation distribution, univariate and multivariate analyses, and statistical sensitivity. Additionally, a parallel version of GSimp was developed for dealing with large scale metabolomics datasets. The R code for GSimp, evaluation pipeline, tutorial, real-world and simulated targeted metabolomics datasets are available at: https://github.com/WandeRum/GSimp. Public Library of Science 2018-01-31 /pmc/articles/PMC5809088/ /pubmed/29385130 http://dx.doi.org/10.1371/journal.pcbi.1005973 Text en © 2018 Wei et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/) , which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. |
spellingShingle | Research Article Wei, Runmin Wang, Jingye Jia, Erik Chen, Tianlu Ni, Yan Jia, Wei GSimp: A Gibbs sampler based left-censored missing value imputation approach for metabolomics studies |
title | GSimp: A Gibbs sampler based left-censored missing value imputation approach for metabolomics studies |
title_full | GSimp: A Gibbs sampler based left-censored missing value imputation approach for metabolomics studies |
title_fullStr | GSimp: A Gibbs sampler based left-censored missing value imputation approach for metabolomics studies |
title_full_unstemmed | GSimp: A Gibbs sampler based left-censored missing value imputation approach for metabolomics studies |
title_short | GSimp: A Gibbs sampler based left-censored missing value imputation approach for metabolomics studies |
title_sort | gsimp: a gibbs sampler based left-censored missing value imputation approach for metabolomics studies |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5809088/ https://www.ncbi.nlm.nih.gov/pubmed/29385130 http://dx.doi.org/10.1371/journal.pcbi.1005973 |
work_keys_str_mv | AT weirunmin gsimpagibbssamplerbasedleftcensoredmissingvalueimputationapproachformetabolomicsstudies AT wangjingye gsimpagibbssamplerbasedleftcensoredmissingvalueimputationapproachformetabolomicsstudies AT jiaerik gsimpagibbssamplerbasedleftcensoredmissingvalueimputationapproachformetabolomicsstudies AT chentianlu gsimpagibbssamplerbasedleftcensoredmissingvalueimputationapproachformetabolomicsstudies AT niyan gsimpagibbssamplerbasedleftcensoredmissingvalueimputationapproachformetabolomicsstudies AT jiawei gsimpagibbssamplerbasedleftcensoredmissingvalueimputationapproachformetabolomicsstudies |