
Robust Differential Abundance Analysis of Microbiome Sequencing Data

It is well known that the microbiome data are ridden with outliers and have heavy distribution tails, but the impact of outliers and heavy-tailedness has yet to be examined systematically. This paper investigates the impact of outliers and heavy-tailedness on differential abundance analysis (DAA) us...


Bibliographic Details
Main Authors: Li, Guanxun; Yang, Lu; Chen, Jun; Zhang, Xianyang
Format: Online Article Text
Language: English
Published: MDPI 2023
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10671797/
https://www.ncbi.nlm.nih.gov/pubmed/38002943
http://dx.doi.org/10.3390/genes14112000
author Li, Guanxun
Yang, Lu
Chen, Jun
Zhang, Xianyang
collection PubMed
description It is well known that the microbiome data are ridden with outliers and have heavy distribution tails, but the impact of outliers and heavy-tailedness has yet to be examined systematically. This paper investigates the impact of outliers and heavy-tailedness on differential abundance analysis (DAA) using the linear models for the differential abundance analysis (LinDA) method and proposes effective strategies to mitigate their influence. The presence of outliers and heavy-tailedness can significantly decrease the power of LinDA. We investigate various techniques to address outliers and heavy-tailedness, including generalizing LinDA into a more flexible framework that allows for the use of robust regression and winsorizing the data before applying LinDA. Our extensive numerical experiments and real-data analyses demonstrate that robust Huber regression has overall the best performance in addressing outliers and heavy-tailedness.
format Online
Article
Text
id pubmed-10671797
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-10671797 2023-10-26 Robust Differential Abundance Analysis of Microbiome Sequencing Data Li, Guanxun Yang, Lu Chen, Jun Zhang, Xianyang Genes (Basel) Article It is well known that the microbiome data are ridden with outliers and have heavy distribution tails, but the impact of outliers and heavy-tailedness has yet to be examined systematically. This paper investigates the impact of outliers and heavy-tailedness on differential abundance analysis (DAA) using the linear models for the differential abundance analysis (LinDA) method and proposes effective strategies to mitigate their influence. The presence of outliers and heavy-tailedness can significantly decrease the power of LinDA. We investigate various techniques to address outliers and heavy-tailedness, including generalizing LinDA into a more flexible framework that allows for the use of robust regression and winsorizing the data before applying LinDA. Our extensive numerical experiments and real-data analyses demonstrate that robust Huber regression has overall the best performance in addressing outliers and heavy-tailedness. MDPI 2023-10-26 /pmc/articles/PMC10671797/ /pubmed/38002943 http://dx.doi.org/10.3390/genes14112000 Text en © 2023 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title Robust Differential Abundance Analysis of Microbiome Sequencing Data
topic Article