coda4microbiome: compositional data analysis for microbiome cross-sectional and longitudinal studies

BACKGROUND: One of the main challenges of microbiome analysis is its compositional nature that if ignored can lead to spurious results. Addressing the compositional structure of microbiome data is particularly critical in longitudinal studies where abundances measured at different times can correspo...

Bibliographic Details
Main authors: Calle, M. Luz, Pujolassos, Meritxell, Susin, Antoni
Format: Online Article Text
Language: English
Published: BioMed Central 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9990256/
https://www.ncbi.nlm.nih.gov/pubmed/36879227
http://dx.doi.org/10.1186/s12859-023-05205-3
_version_ 1784901901167362048
author Calle, M. Luz
Pujolassos, Meritxell
Susin, Antoni
collection PubMed
description BACKGROUND: One of the main challenges of microbiome analysis is its compositional nature, which, if ignored, can lead to spurious results. Addressing the compositional structure of microbiome data is particularly critical in longitudinal studies, where abundances measured at different times can correspond to different sub-compositions. RESULTS: We developed coda4microbiome, a new R package for analyzing microbiome data within the Compositional Data Analysis (CoDA) framework in both cross-sectional and longitudinal studies. The aim of coda4microbiome is prediction: the method is designed to identify a model (microbial signature) containing the minimum number of features with the maximum predictive power. The algorithm relies on the analysis of log-ratios between pairs of components, and variable selection is addressed through penalized regression on the “all-pairs log-ratio model”, the model containing all possible pairwise log-ratios. For longitudinal data, the algorithm infers dynamic microbial signatures by performing penalized regression over a summary of the log-ratio trajectories (the area under these trajectories). In both cross-sectional and longitudinal studies, the inferred microbial signature is expressed as the (weighted) balance between two groups of taxa: those that contribute positively to the microbial signature and those that contribute negatively. The package provides several graphical representations that facilitate the interpretation of the analysis and the identified microbial signatures. We illustrate the new method with data from a Crohn's disease study (cross-sectional data) and from a study of the developing microbiome of infants (longitudinal data). CONCLUSIONS: coda4microbiome is a new algorithm for identification of microbial signatures in both cross-sectional and longitudinal studies.
The algorithm is implemented as an R package that is available on CRAN (https://cran.r-project.org/web/packages/coda4microbiome/) and is accompanied by a vignette with a detailed description of the functions. The website of the project contains several tutorials: https://malucalle.github.io/coda4microbiome/ SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12859-023-05205-3.
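The ideas in the abstract can be illustrated with a minimal Python sketch. It is not the coda4microbiome API (the package itself is written in R), and it leaves out the penalized-regression fit. All function names, the pseudocount used for zeros, the trapezoidal rule for the area under a trajectory, and the example coefficients are assumptions made for illustration only. The sketch shows how all pairwise log-ratios are built, how a sparse set of log-ratio coefficients collapses into per-taxon weights that sum to zero (a balance between a positive and a negative group), and how a log-ratio trajectory can be summarized by its area.

```python
import math
from itertools import combinations


def all_pairs_log_ratios(sample, pseudocount=1.0):
    """Return {(j, k): log(x_j / x_k)} for every pair j < k of one sample.

    The pseudocount is an assumed way of handling zero counts.
    """
    shifted = [x + pseudocount for x in sample]
    return {(j, k): math.log(shifted[j] / shifted[k])
            for j, k in combinations(range(len(shifted)), 2)}


def taxon_coefficients(pair_coefs, n_taxa):
    """Collapse log-ratio coefficients into one weight per taxon.

    A term beta * log(x_j / x_k) adds +beta to taxon j and -beta to taxon k,
    so the per-taxon weights always sum to zero.
    """
    theta = [0.0] * n_taxa
    for (j, k), beta in pair_coefs.items():
        theta[j] += beta
        theta[k] -= beta
    return theta


def signature_score(sample, theta, pseudocount=1.0):
    """Weighted sum of log-abundances using the per-taxon weights."""
    return sum(t * math.log(x + pseudocount) for t, x in zip(theta, sample))


def trajectory_auc(times, values):
    """Trapezoidal area under a trajectory observed at the given times."""
    return sum((t1 - t0) * (v0 + v1) / 2.0
               for t0, t1, v0, v1 in zip(times, times[1:], values, values[1:]))


# Cross-sectional toy example: one sample with four taxa.
counts = [120, 30, 0, 55]
ratios = all_pairs_log_ratios(counts)

# Hypothetical coefficients, as if a penalized fit had kept two log-ratios.
pair_coefs = {(0, 1): 0.8, (1, 3): -0.5}
theta = taxon_coefficients(pair_coefs, len(counts))
score = signature_score(counts, theta)

# Longitudinal toy example: one subject, two taxa, three time points.
times = [0.0, 1.0, 3.0]
taxon_a = [10, 40, 80]
taxon_b = [20, 20, 10]
log_ratio_path = [math.log((a + 1.0) / (b + 1.0))
                  for a, b in zip(taxon_a, taxon_b)]
auc_feature = trajectory_auc(times, log_ratio_path)
```

Taxa with positive weights in `theta` form one group of the balance and taxa with negative weights form the other. In the longitudinal case, a value such as `auc_feature` would replace the single log-ratio as the regression feature for that pair.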
format Online
Article
Text
id pubmed-9990256
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-9990256 2023-03-08 BMC Bioinformatics Software BioMed Central 2023-03-06 /pmc/articles/PMC9990256/ /pubmed/36879227 http://dx.doi.org/10.1186/s12859-023-05205-3 Text en © The Author(s) 2023 https://creativecommons.org/licenses/by/4.0/ Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
spellingShingle Software
Calle, M. Luz
Pujolassos, Meritxell
Susin, Antoni
coda4microbiome: compositional data analysis for microbiome cross-sectional and longitudinal studies
title coda4microbiome: compositional data analysis for microbiome cross-sectional and longitudinal studies
topic Software