
Information Theoretic Metagenome Assembly Allows the Discovery of Disease Biomarkers in Human Microbiome


Bibliographic Details
Main Author: Nalbantoglu, O. Ufuk
Format: Online Article Text
Language: English
Published: MDPI 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7913240/
https://www.ncbi.nlm.nih.gov/pubmed/33540903
http://dx.doi.org/10.3390/e23020187
Description
Summary: Quantitative metagenomics is an important field that has delivered successful microbiome biomarkers associated with host phenotypes. The current convention depends mainly on unsupervised assembly of metagenomic contigs, which can leave interesting genetic material unassembled. Additionally, biomarkers are commonly defined by the differential relative abundance of compositional or functional units. Accumulating evidence suggests that microbial genetic variations are as important as differential abundance, implying the need for novel methods that account for genetic variation in metagenomics studies. We propose an information theoretic metagenome assembly algorithm that discovers genomic fragments with maximal self-information, defined by the empirical distributions of nucleotides across the phenotypes and quantified with the help of statistical tests. Our algorithm infers fragments that collect the most informative genetic variants in a single contig, named supervariant fragments. Experiments on simulated metagenomes, as well as on a colorectal cancer dataset and an atherosclerotic cardiovascular disease dataset, consistently discovered sequences strongly associated with the disease phenotypes. Moreover, the discriminatory power of these putative biomarkers was mainly attributed to genetic variations rather than relative abundance. Our results suggest that metagenomics methods that consider microbiome population genetics might be useful in discovering disease biomarkers with great potential for translation to molecular diagnostics and biotherapeutics applications.
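
The idea of scoring genomic positions by self-information across phenotypes can be illustrated with a small sketch. This is not the paper's algorithm: it assumes two phenotypes, takes the p-value of a chi-square test on per-position nucleotide counts as the event probability, defines information as -log2(p), and picks a fixed-width window with the highest summed score as a stand-in for a fragment. The names `chi2_sf`, `position_information`, and `best_window` are hypothetical and introduced only for this illustration.

```python
import math

NUCLEOTIDES = "ACGT"


def chi2_sf(x, df):
    """Chi-square survival function, closed forms for df in {0, 1, 2, 3}."""
    if df == 0 or x <= 0.0:
        return 1.0
    if df == 1:
        return math.erfc(math.sqrt(x / 2.0))
    if df == 2:
        return math.exp(-x / 2.0)
    if df == 3:
        return (math.erfc(math.sqrt(x / 2.0))
                + math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0))
    raise ValueError("df must be between 0 and 3")


def position_information(case_counts, control_counts):
    """Assumed score: -log2(p) of a chi-square test on nucleotide counts."""
    # Keep only nucleotides seen at least once so expected counts are non-zero.
    seen = [n for n in NUCLEOTIDES
            if case_counts.get(n, 0) + control_counts.get(n, 0) > 0]
    n_case = sum(case_counts.get(n, 0) for n in seen)
    n_control = sum(control_counts.get(n, 0) for n in seen)
    if len(seen) < 2 or n_case == 0 or n_control == 0:
        return 0.0  # no variation or an empty group: nothing to test
    total = n_case + n_control
    stat = 0.0
    for n in seen:
        column = case_counts.get(n, 0) + control_counts.get(n, 0)
        for observed, row in ((case_counts.get(n, 0), n_case),
                              (control_counts.get(n, 0), n_control)):
            expected = row * column / total
            stat += (observed - expected) ** 2 / expected
    p = chi2_sf(stat, len(seen) - 1)
    p = min(max(p, 1e-300), 1.0)  # guard against log(0) and rounding above 1
    return -math.log2(p)


def best_window(scores, width):
    """Start index and total of the contiguous window with the largest sum."""
    if width <= 0 or width > len(scores):
        raise ValueError("width must be between 1 and len(scores)")
    current = sum(scores[:width])
    best_start, best_total = 0, current
    for start in range(1, len(scores) - width + 1):
        # Slide the window one position to the right.
        current += scores[start + width - 1] - scores[start - 1]
        if current > best_total:
            best_start, best_total = start, current
    return best_start, best_total


# Toy per-position counts (made up): (cases, controls)
positions = [
    ({"A": 10, "C": 10}, {"A": 10, "C": 10}),  # same distribution
    ({"A": 20}, {"C": 20}),                    # fully phenotype-specific
    ({"G": 18, "T": 2}, {"G": 3, "T": 17}),    # strongly skewed
    ({"A": 20}, {"A": 20}),                    # invariant
]
scores = [position_information(case, control) for case, control in positions]
start, total = best_window(scores, 2)
```

In this toy input the window starting at index 1 covers the two positions whose nucleotide distributions differ between groups. A real method would also need to handle sequencing depth, multiple testing, and how fragments are actually assembled, none of which this sketch addresses.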