mbDenoise: microbiome data denoising using zero-inflated probabilistic principal components analysis

Bibliographic Details
Main authors: Zeng, Yanyan; Li, Jing; Wei, Chaochun; Zhao, Hongyu; Tao, Wang
Format: Online Article Text
Language: English
Published: BioMed Central 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9011970/
https://www.ncbi.nlm.nih.gov/pubmed/35422001
http://dx.doi.org/10.1186/s13059-022-02657-3
Description
Summary: The analysis of microbiome data poses several technical challenges. In particular, count matrices contain a large proportion of zeros, some of which are biological, whereas others are technical. Furthermore, the measurements suffer from unequal sequencing depth, overdispersion, and data redundancy. These nuisance factors introduce substantial noise. We propose an accurate and robust method, mbDenoise, for denoising microbiome data. Assuming a zero-inflated probabilistic PCA (ZIPPCA) model, mbDenoise uses variational approximation to learn the latent structure and recovers the true abundance levels using the posterior, borrowing information across samples and taxa. mbDenoise outperforms state-of-the-art methods at extracting the signal for downstream analyses. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at (10.1186/s13059-022-02657-3).
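The nuisance factors named in the summary (technical zeros, unequal sequencing depth, overdispersion, and redundancy from a shared low-dimensional structure) can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions: the record does not give the count distribution or parameterization of ZIPPCA, so the gamma-Poisson (negative binomial) counts, the log link, and every name and default value here are hypothetical choices, not the authors' model or the mbDenoise software. It generates noisy data only and performs no variational inference or denoising.

```python
import numpy as np


def simulate_zi_counts(n_samples=50, n_taxa=30, rank=2,
                       zero_prob=0.3, dispersion=2.0, seed=0):
    """Simulate a zero-inflated, overdispersed, low-rank count matrix.

    All distributional choices are illustrative assumptions and are not
    taken from the mbDenoise paper.
    """
    rng = np.random.default_rng(seed)

    # Low-dimensional latent structure shared across taxa (data redundancy).
    factors = rng.normal(size=(n_samples, rank))
    loadings = rng.normal(scale=0.5, size=(n_taxa, rank))

    # Per-taxon baseline abundance and per-sample depth effect (log scale).
    taxon_offset = rng.normal(loc=1.0, size=n_taxa)
    depth_offset = rng.normal(scale=0.5, size=(n_samples, 1))

    # "True" mean abundance through a log link (assumed).
    mean = np.exp(depth_offset + taxon_offset + factors @ loadings.T)

    # Gamma-Poisson mixture gives negative binomial counts (overdispersion).
    rate = rng.gamma(shape=dispersion, scale=mean / dispersion)
    counts = rng.poisson(rate)

    # Technical zeros: entries dropped regardless of true abundance.
    technical_zero = rng.random((n_samples, n_taxa)) < zero_prob
    observed = np.where(technical_zero, 0, counts)
    return observed, mean, technical_zero


observed, mean, technical_zero = simulate_zi_counts()
print("fraction of zeros observed:", (observed == 0).mean())
print("fraction that are technical:", technical_zero.mean())
```

Comparing the two printed fractions shows that the observed zeros mix technical dropouts with genuinely low counts, which is the ambiguity a denoising method has to resolve.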