
MetaLonDA: a flexible R package for identifying time intervals of differentially abundant features in metagenomic longitudinal studies


Bibliographic Details
Main authors: Metwally, Ahmed A.; Yang, Jie; Ascoli, Christian; Dai, Yang; Finn, Patricia W.; Perkins, David L.
Format: Online Article Text
Language: English
Published: BioMed Central 2018
Subjects: Software
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5812052/
https://www.ncbi.nlm.nih.gov/pubmed/29439731
http://dx.doi.org/10.1186/s40168-018-0402-y
author Metwally, Ahmed A.
Yang, Jie
Ascoli, Christian
Dai, Yang
Finn, Patricia W.
Perkins, David L.
collection PubMed
description BACKGROUND: Microbial longitudinal studies are powerful experimental designs utilized to classify diseases, determine prognosis, and analyze microbial systems dynamics. In longitudinal studies, only identifying differential features between two phenotypes does not provide sufficient information to determine whether a change in the relative abundance is short-term or continuous. Furthermore, sample collection in longitudinal studies suffers from all forms of variability such as a different number of subjects per phenotypic group, a different number of samples per subject, and samples not collected at consistent time points. These inconsistencies are common in studies that collect samples from human subjects. RESULTS: We present MetaLonDA, an R package that is capable of identifying significant time intervals of differentially abundant microbial features. MetaLonDA is flexible such that it can perform differential abundance tests despite inconsistencies associated with sample collection. Extensive experiments on simulated datasets quantitatively demonstrate the effectiveness of MetaLonDA with significant improvement over alternative methods. We applied MetaLonDA to the DIABIMMUNE cohort (https://pubs.broadinstitute.org/diabimmune) substantiating significant early lifetime intervals of exposure to Bacteroides and Bifidobacterium in Finnish and Russian infants. Additionally, we established significant time intervals during which novel differentially relative abundant microbial genera may contribute to aberrant immunogenicity and development of autoimmune disease. CONCLUSION: MetaLonDA is computationally efficient and can be run on desktop machines. The identified differentially abundant features and their time intervals have the potential to distinguish microbial biomarkers that may be used for microbial reconstitution through bacteriotherapy, probiotics, or antibiotics. 
Moreover, MetaLonDA can be applied to any longitudinal count data such as metagenomic sequencing, 16S rRNA gene sequencing, or RNAseq. MetaLonDA is publicly available on CRAN (https://CRAN.R-project.org/package=MetaLonDA). ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s40168-018-0402-y) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-5812052
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-5812052 2018-02-15 Microbiome Software BioMed Central 2018-02-13 /pmc/articles/PMC5812052/ /pubmed/29439731 http://dx.doi.org/10.1186/s40168-018-0402-y Text en © The Author(s) 2018 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title MetaLonDA: a flexible R package for identifying time intervals of differentially abundant features in metagenomic longitudinal studies
topic Software