GMPR: A robust normalization method for zero-inflated count data with application to microbiome sequencing data

Bibliographic Details
Main authors: Chen, Li; Reeve, James; Zhang, Lujun; Huang, Shengbing; Wang, Xuefeng; Chen, Jun
Format: Online Article Text
Language: English
Published: PeerJ Inc. 2018
Subjects: Bioinformatics
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5885979/
https://www.ncbi.nlm.nih.gov/pubmed/29629248
http://dx.doi.org/10.7717/peerj.4600
author Chen, Li
Reeve, James
Zhang, Lujun
Huang, Shengbing
Wang, Xuefeng
Chen, Jun
author_sort Chen, Li
collection PubMed
description Normalization is the first critical step in microbiome sequencing data analysis used to account for variable library sizes. Current RNA-Seq based normalization methods that have been adapted for microbiome data fail to consider the unique characteristics of microbiome data, which contain a vast number of zeros due to the physical absence or under-sampling of the microbes. Normalization methods that specifically address the zero-inflation remain largely undeveloped. Here we propose geometric mean of pairwise ratios—a simple but effective normalization method—for zero-inflated sequencing data such as microbiome data. Simulation studies and real datasets analyses demonstrate that the proposed method is more robust than competing methods, leading to more powerful detection of differentially abundant taxa and higher reproducibility of the relative abundances of taxa.
format Online
Article
Text
id pubmed-5885979
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher PeerJ Inc.
record_format MEDLINE/PubMed
spelling pubmed-5885979
2018-04-06
GMPR: A robust normalization method for zero-inflated count data with application to microbiome sequencing data
Chen, Li
Reeve, James
Zhang, Lujun
Huang, Shengbing
Wang, Xuefeng
Chen, Jun
PeerJ
Bioinformatics
Normalization is the first critical step in microbiome sequencing data analysis used to account for variable library sizes. Current RNA-Seq based normalization methods that have been adapted for microbiome data fail to consider the unique characteristics of microbiome data, which contain a vast number of zeros due to the physical absence or under-sampling of the microbes. Normalization methods that specifically address the zero-inflation remain largely undeveloped. Here we propose geometric mean of pairwise ratios, a simple but effective normalization method, for zero-inflated sequencing data such as microbiome data. Simulation studies and real datasets analyses demonstrate that the proposed method is more robust than competing methods, leading to more powerful detection of differentially abundant taxa and higher reproducibility of the relative abundances of taxa.
PeerJ Inc.
2018-04-02
/pmc/articles/PMC5885979/
/pubmed/29629248
http://dx.doi.org/10.7717/peerj.4600
Text
en
© 2018 Chen et al.
http://creativecommons.org/licenses/by/4.0/
This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ) and either DOI or URL of the article must be cited.
title GMPR: A robust normalization method for zero-inflated count data with application to microbiome sequencing data
topic Bioinformatics