A comprehensive evaluation of microbial differential abundance analysis methods: current status and potential solutions
BACKGROUND: Differential abundance analysis (DAA) is a central statistical task in microbiome data analysis. A robust and powerful DAA tool can help identify highly confident microbial candidates for further biological validation. Numerous DAA tools have been proposed in the past decade addressing...
Main authors: | Yang, Lu; Chen, Jun |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central, 2022 |
Subjects: | Research |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9392415/ https://www.ncbi.nlm.nih.gov/pubmed/35986393 http://dx.doi.org/10.1186/s40168-022-01320-0 |
_version_ | 1784771056947429376 |
---|---|
author | Yang, Lu Chen, Jun |
author_facet | Yang, Lu Chen, Jun |
author_sort | Yang, Lu |
collection | PubMed |
description | BACKGROUND: Differential abundance analysis (DAA) is a central statistical task in microbiome data analysis. A robust and powerful DAA tool can help identify highly confident microbial candidates for further biological validation. Numerous DAA tools have been proposed in the past decade addressing the special characteristics of microbiome data such as zero inflation and compositional effects. Disturbingly, different DAA tools can sometimes produce quite discordant results, opening up the possibility of cherry-picking the tool in favor of one’s own hypothesis. To recommend the best DAA tool or practice to the field, a comprehensive evaluation, which covers as many biologically relevant scenarios as possible, is critically needed. RESULTS: We performed by far the most comprehensive evaluation of existing DAA tools using real data-based simulations. We found that DAA methods explicitly addressing compositional effects, such as ANCOM-BC, ALDEx2, metagenomeSeq (fitFeatureModel), and DACOMP, did have improved performance in false-positive control. But they are still not optimal: type I error inflation or low statistical power was observed in many settings. The recent LDM method generally had the best power, but its false-positive control in the presence of strong compositional effects was not satisfactory. Overall, none of the evaluated methods is simultaneously robust, powerful, and flexible, which makes the selection of the best DAA tool difficult. To meet the analysis needs, we designed an optimized procedure, ZicoSeq, drawing on the strengths of the existing DAA methods. We show that ZicoSeq generally controlled for false positives across settings, and its power was among the highest. Application of DAA methods to a large collection of real datasets revealed a pattern similar to that observed in the simulation studies.
CONCLUSIONS: Based on the benchmarking study, we conclude that none of the existing DAA methods evaluated can be applied blindly to any real microbiome dataset. The applicability of an existing DAA method depends on specific settings, which are usually unknown a priori. To circumvent the difficulty of selecting the best DAA tool in practice, we designed ZicoSeq, which addresses the major challenges in DAA and remedies the drawbacks of existing DAA methods. ZicoSeq can be applied to microbiome datasets from diverse settings and is a useful DAA tool for robust microbiome biomarker discovery. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s40168-022-01320-0. |
format | Online Article Text |
id | pubmed-9392415 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-9392415 2022-08-21 A comprehensive evaluation of microbial differential abundance analysis methods: current status and potential solutions Yang, Lu Chen, Jun Microbiome Research BioMed Central 2022-08-19 /pmc/articles/PMC9392415/ /pubmed/35986393 http://dx.doi.org/10.1186/s40168-022-01320-0 Text en © The Author(s) 2022 https://creativecommons.org/licenses/by/4.0/ Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data. |
spellingShingle | Research Yang, Lu Chen, Jun A comprehensive evaluation of microbial differential abundance analysis methods: current status and potential solutions |
title | A comprehensive evaluation of microbial differential abundance analysis methods: current status and potential solutions |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9392415/ https://www.ncbi.nlm.nih.gov/pubmed/35986393 http://dx.doi.org/10.1186/s40168-022-01320-0 |