
An empirical Bayes approach to normalization and differential abundance testing for microbiome data

Bibliographic Details
Main Authors: Liu, Tiantian; Zhao, Hongyu; Wang, Tao
Format: Online Article Text
Language: English
Published: BioMed Central 2020
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7268703/
https://www.ncbi.nlm.nih.gov/pubmed/32493208
http://dx.doi.org/10.1186/s12859-020-03552-z
author Liu, Tiantian
Zhao, Hongyu
Wang, Tao
collection PubMed
description BACKGROUND: Advances in DNA sequencing have offered researchers an unprecedented opportunity to better study the variety of species living in and on the human body. However, the analysis of microbiome data is complicated by several challenges. First, the sequencing depth may vary by orders of magnitude across samples. Second, species are rare and the data often contain many zeros. Third, the specimen is a fraction of the microbial ecosystem, and so the data are compositional carrying only relative information. Other characteristics of microbiome data include pronounced over-dispersion in taxon abundances, and the existence of a phylogenetic tree that relates all bacterial species. To address some of these challenges, microbiome analysis workflows often normalize the read counts prior to downstream analysis. However, there are limitations in the current literature on the normalization of microbiome data. RESULTS: Under the multinomial distribution for the read counts and a prior for the unknown proportions, we propose an empirical Bayes approach to microbiome data normalization. Using a tree-based extension of the Dirichlet prior, we further extend our method by incorporating the phylogenetic tree into the normalization process. We study the impact of normalization on differential abundance analysis. In the presence of tree structure, we propose a phylogeny-aware detection procedure. CONCLUSIONS: Extensive simulations and gut microbiome data applications are conducted to demonstrate the superior performance of our empirical Bayes method over other normalization methods, and over commonly-used methods for differential abundance testing. Original R scripts are available at GitHub (https://github.com/liudoubletian/eBay).
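The abstract's core idea, a multinomial model for read counts with a prior on the unknown proportions, can be illustrated with the simplest conjugate case. The sketch below is not the authors' eBay implementation, and this record does not describe how they estimate the prior. It assumes a plain Dirichlet prior whose mean is the pooled taxon proportion across samples, with a user-supplied `concentration` (a hypothetical parameter), and returns the posterior mean proportions. The function name `dirichlet_posterior_mean` is also hypothetical.

```python
def dirichlet_posterior_mean(counts, concentration=1.0):
    """Shrink per-sample proportions toward pooled proportions.

    counts: list of samples, each a list of non-negative read counts per taxon.
    concentration: assumed total prior weight (sum of Dirichlet parameters);
        an illustrative choice, not a value estimated as in the paper.
    """
    total = sum(sum(row) for row in counts)
    if total == 0:
        raise ValueError("counts must contain at least one read")
    n_taxa = len(counts[0])

    # Prior mean estimated from the data itself (the "empirical" step).
    pooled = [sum(row[j] for row in counts) / total for j in range(n_taxa)]
    alpha = [concentration * p for p in pooled]
    alpha_sum = sum(alpha)

    normalized = []
    for row in counts:
        depth = sum(row)  # sequencing depth of this sample
        # Posterior mean under a Dirichlet prior and multinomial likelihood.
        normalized.append(
            [(x + a) / (depth + alpha_sum) for x, a in zip(row, alpha)]
        )
    return normalized


# Three samples whose depths differ by orders of magnitude, with some zeros.
counts = [
    [90, 10, 0],
    [5, 0, 5],
    [400, 500, 100],
]
for row in dirichlet_posterior_mean(counts, concentration=2.0):
    print([round(p, 4) for p in row])
```

Each normalized row sums to one, and a zero count becomes a small positive proportion whenever the taxon is observed in some other sample. Shallow samples are pulled toward the pooled proportions more strongly than deep ones, which is the usual motivation for this kind of shrinkage.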
format Online
Article
Text
id pubmed-7268703
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-7268703 2020-06-08 An empirical Bayes approach to normalization and differential abundance testing for microbiome data Liu, Tiantian Zhao, Hongyu Wang, Tao BMC Bioinformatics Methodology Article
BioMed Central 2020-06-03 /pmc/articles/PMC7268703/ /pubmed/32493208 http://dx.doi.org/10.1186/s12859-020-03552-z Text en © The Author(s) 2020 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
title An empirical Bayes approach to normalization and differential abundance testing for microbiome data
topic Methodology Article