
A comprehensive evaluation of microbial differential abundance analysis methods: current status and potential solutions

BACKGROUND: Differential abundance analysis (DAA) is one central statistical task in microbiome data analysis. A robust and powerful DAA tool can help identify highly confident microbial candidates for further biological validation. Numerous DAA tools have been proposed in the past decade addressing...


Bibliographic Details
Main authors: Yang, Lu; Chen, Jun
Format: Online Article Text
Language: English
Published: BioMed Central 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9392415/
https://www.ncbi.nlm.nih.gov/pubmed/35986393
http://dx.doi.org/10.1186/s40168-022-01320-0
_version_ 1784771056947429376
author Yang, Lu
Chen, Jun
author_facet Yang, Lu
Chen, Jun
author_sort Yang, Lu
collection PubMed
description BACKGROUND: Differential abundance analysis (DAA) is one central statistical task in microbiome data analysis. A robust and powerful DAA tool can help identify highly confident microbial candidates for further biological validation. Numerous DAA tools have been proposed in the past decade addressing the special characteristics of microbiome data such as zero inflation and compositional effects. Disturbingly, different DAA tools could sometimes produce quite discordant results, opening to the possibility of cherry-picking the tool in favor of one’s own hypothesis. To recommend the best DAA tool or practice to the field, a comprehensive evaluation, which covers as many biologically relevant scenarios as possible, is critically needed.
RESULTS: We performed by far the most comprehensive evaluation of existing DAA tools using real data-based simulations. We found that DAA methods explicitly addressing compositional effects such as ANCOM-BC, Aldex2, metagenomeSeq (fitFeatureModel), and DACOMP did have improved performance in false-positive control. But they are still not optimal: type 1 error inflation or low statistical power has been observed in many settings. The recent LDM method generally had the best power, but its false-positive control in the presence of strong compositional effects was not satisfactory. Overall, none of the evaluated methods is simultaneously robust, powerful, and flexible, which makes the selection of the best DAA tool difficult. To meet the analysis needs, we designed an optimized procedure, ZicoSeq, drawing on the strength of the existing DAA methods. We show that ZicoSeq generally controlled for false positives across settings, and the power was among the highest. Application of DAA methods to a large collection of real datasets revealed a similar pattern observed in simulation studies.
CONCLUSIONS: Based on the benchmarking study, we conclude that none of the existing DAA methods evaluated can be applied blindly to any real microbiome dataset. The applicability of an existing DAA method depends on specific settings, which are usually unknown a priori. To circumvent the difficulty of selecting the best DAA tool in practice, we design ZicoSeq, which addresses the major challenges in DAA and remedies the drawbacks of existing DAA methods. ZicoSeq can be applied to microbiome datasets from diverse settings and is a useful DAA tool for robust microbiome biomarker discovery.
SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s40168-022-01320-0.
format Online
Article
Text
id pubmed-9392415
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-9392415 2022-08-21 A comprehensive evaluation of microbial differential abundance analysis methods: current status and potential solutions Yang, Lu Chen, Jun Microbiome Research
BioMed Central 2022-08-19 /pmc/articles/PMC9392415/ /pubmed/35986393 http://dx.doi.org/10.1186/s40168-022-01320-0 Text en
© The Author(s) 2022 https://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
spellingShingle Research
Yang, Lu
Chen, Jun
A comprehensive evaluation of microbial differential abundance analysis methods: current status and potential solutions
title A comprehensive evaluation of microbial differential abundance analysis methods: current status and potential solutions
title_full A comprehensive evaluation of microbial differential abundance analysis methods: current status and potential solutions
title_fullStr A comprehensive evaluation of microbial differential abundance analysis methods: current status and potential solutions
title_full_unstemmed A comprehensive evaluation of microbial differential abundance analysis methods: current status and potential solutions
title_short A comprehensive evaluation of microbial differential abundance analysis methods: current status and potential solutions
title_sort comprehensive evaluation of microbial differential abundance analysis methods: current status and potential solutions
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9392415/
https://www.ncbi.nlm.nih.gov/pubmed/35986393
http://dx.doi.org/10.1186/s40168-022-01320-0
work_keys_str_mv AT yanglu acomprehensiveevaluationofmicrobialdifferentialabundanceanalysismethodscurrentstatusandpotentialsolutions
AT chenjun acomprehensiveevaluationofmicrobialdifferentialabundanceanalysismethodscurrentstatusandpotentialsolutions
AT yanglu comprehensiveevaluationofmicrobialdifferentialabundanceanalysismethodscurrentstatusandpotentialsolutions
AT chenjun comprehensiveevaluationofmicrobialdifferentialabundanceanalysismethodscurrentstatusandpotentialsolutions