
A latent allocation model for the analysis of microbial composition and disease

BACKGROUND: Establishing the relationship between microbiota and specific diseases is important but requires appropriate statistical methodology. A specialized feature of microbiome count data is the presence of a large number of zeros, which makes it difficult to analyze in case-control studies. Mo...


Bibliographic Details
Main Authors: Abe, Ko, Hirayama, Masaaki, Ohno, Kinji, Shimamura, Teppei
Format: Online Article Text
Language: English
Published: BioMed Central 2018
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6311924/
https://www.ncbi.nlm.nih.gov/pubmed/30598099
http://dx.doi.org/10.1186/s12859-018-2530-6
_version_ 1783383702539075584
author Abe, Ko
Hirayama, Masaaki
Ohno, Kinji
Shimamura, Teppei
author_facet Abe, Ko
Hirayama, Masaaki
Ohno, Kinji
Shimamura, Teppei
author_sort Abe, Ko
collection PubMed
description BACKGROUND: Establishing the relationship between microbiota and specific diseases is important but requires appropriate statistical methodology. A specialized feature of microbiome count data is the presence of a large number of zeros, which makes it difficult to analyze in case-control studies. Most existing approaches either add a small number called a pseudo-count or use probability models such as the multinomial and Dirichlet-multinomial distributions to explain the excess zero counts, which may produce unnecessary biases and impose a correlation structure that is unsuitable for microbiome data. RESULTS: The purpose of this article is to develop a new probabilistic model, called BERnoulli and MUltinomial Distribution-based latent Allocation (BERMUDA), to address these problems. BERMUDA enables us to describe the differences in bacterial composition and a certain disease among samples. We also provide a simple and efficient learning procedure for the proposed model using an annealing EM algorithm. CONCLUSION: We illustrate the performance of the proposed method through both simulation and real data analysis. BERMUDA is implemented in R and is available from GitHub (https://github.com/abikoushi/Bermuda).
format Online
Article
Text
id pubmed-6311924
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-63119242019-01-07 A latent allocation model for the analysis of microbial composition and disease Abe, Ko Hirayama, Masaaki Ohno, Kinji Shimamura, Teppei BMC Bioinformatics Research BACKGROUND: Establishing the relationship between microbiota and specific diseases is important but requires appropriate statistical methodology. A specialized feature of microbiome count data is the presence of a large number of zeros, which makes it difficult to analyze in case-control studies. Most existing approaches either add a small number called a pseudo-count or use probability models such as the multinomial and Dirichlet-multinomial distributions to explain the excess zero counts, which may produce unnecessary biases and impose a correlation structure that is unsuitable for microbiome data. RESULTS: The purpose of this article is to develop a new probabilistic model, called BERnoulli and MUltinomial Distribution-based latent Allocation (BERMUDA), to address these problems. BERMUDA enables us to describe the differences in bacterial composition and a certain disease among samples. We also provide a simple and efficient learning procedure for the proposed model using an annealing EM algorithm. CONCLUSION: We illustrate the performance of the proposed method through both simulation and real data analysis. BERMUDA is implemented in R and is available from GitHub (https://github.com/abikoushi/Bermuda). BioMed Central 2018-12-31 /pmc/articles/PMC6311924/ /pubmed/30598099 http://dx.doi.org/10.1186/s12859-018-2530-6 Text en © The Author(s) 2018 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle Research
Abe, Ko
Hirayama, Masaaki
Ohno, Kinji
Shimamura, Teppei
A latent allocation model for the analysis of microbial composition and disease
title A latent allocation model for the analysis of microbial composition and disease
title_full A latent allocation model for the analysis of microbial composition and disease
title_fullStr A latent allocation model for the analysis of microbial composition and disease
title_full_unstemmed A latent allocation model for the analysis of microbial composition and disease
title_short A latent allocation model for the analysis of microbial composition and disease
title_sort latent allocation model for the analysis of microbial composition and disease
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6311924/
https://www.ncbi.nlm.nih.gov/pubmed/30598099
http://dx.doi.org/10.1186/s12859-018-2530-6