A zero inflated log-normal model for inference of sparse microbial association networks


Bibliographic Details
Main authors: Prost, Vincent, Gazut, Stéphane, Brüls, Thomas
Format: Online Article Text
Language: English
Published: Public Library of Science 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8244920/
https://www.ncbi.nlm.nih.gov/pubmed/34143768
http://dx.doi.org/10.1371/journal.pcbi.1009089
author Prost, Vincent
Gazut, Stéphane
Brüls, Thomas
collection PubMed
description The advent of high-throughput metagenomic sequencing has prompted the development of efficient taxonomic profiling methods that measure the presence, abundance and phylogeny of organisms in a wide range of environmental samples. Multivariate sequence-derived abundance data further has the potential to enable inference of ecological associations between microbial populations, but several technical issues need to be accounted for, such as the compositional nature of the data, its extreme sparsity and overdispersion, as well as the frequent need to operate in under-determined regimes. The ecological network reconstruction problem is frequently cast into the paradigm of Gaussian Graphical Models (GGMs), for which efficient structure inference algorithms are available, such as the graphical lasso and neighborhood selection. Unfortunately, GGMs and their variants cannot properly account for the extremely sparse patterns occurring in real-world metagenomic taxonomic profiles. In particular, structural zeros (as opposed to sampling zeros), which correspond to true absences of biological signals, are not properly handled by most statistical methods. We present here a zero-inflated log-normal graphical model (available at https://github.com/vincentprost/Zi-LN) specifically aimed at handling such “biological” zeros, and demonstrate significant performance gains over state-of-the-art statistical methods for the inference of microbial association networks, with the most notable gains obtained when analyzing taxonomic profiles displaying sparsity levels on par with real-world metagenomic datasets.
format Online
Article
Text
id pubmed-8244920
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-8244920 2021-07-12 A zero inflated log-normal model for inference of sparse microbial association networks Prost, Vincent Gazut, Stéphane Brüls, Thomas PLoS Comput Biol Research Article
Public Library of Science 2021-06-18 /pmc/articles/PMC8244920/ /pubmed/34143768 http://dx.doi.org/10.1371/journal.pcbi.1009089 Text en © 2021 Prost et al. This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title A zero inflated log-normal model for inference of sparse microbial association networks
topic Research Article