
Information Theoretic Metagenome Assembly Allows the Discovery of Disease Biomarkers in Human Microbiome

Quantitative metagenomics is an important field that has delivered successful microbiome biomarkers associated with host phenotypes. The current convention relies mainly on unsupervised assembly of metagenomic contigs, which may leave interesting genetic material unassembled. Additiona...

Bibliographic Details
Main Author: Nalbantoglu, O. Ufuk
Format: Online Article Text
Language: English
Published: MDPI 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7913240/
https://www.ncbi.nlm.nih.gov/pubmed/33540903
http://dx.doi.org/10.3390/e23020187
_version_ 1783656759833919488
author Nalbantoglu, O. Ufuk
author_facet Nalbantoglu, O. Ufuk
author_sort Nalbantoglu, O. Ufuk
collection PubMed
description Quantitative metagenomics is an important field that has delivered successful microbiome biomarkers associated with host phenotypes. The current convention relies mainly on unsupervised assembly of metagenomic contigs, which may leave interesting genetic material unassembled. Additionally, biomarkers are commonly defined by the differential relative abundance of compositional or functional units. Accumulating evidence indicates that microbial genetic variations are as important as differential abundance, implying the need for novel methods that account for genetic variations in metagenomics studies. We propose an information theoretic metagenome assembly algorithm that discovers genomic fragments with maximal self-information, defined by the empirical distributions of nucleotides across the phenotypes and quantified with the help of statistical tests. Our algorithm infers fragments that gather the most informative genetic variants in a single contig, named supervariant fragments. Experiments on simulated metagenomes, as well as on a colorectal cancer dataset and an atherosclerotic cardiovascular disease dataset, consistently discovered sequences strongly associated with the disease phenotypes. Moreover, the discriminatory power of these putative biomarkers was mainly attributed to genetic variations rather than relative abundance. Our results suggest that metagenomics methods that consider microbiome population genetics might be useful in discovering disease biomarkers with great potential for translation to molecular diagnostics and biotherapeutics applications.
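The description does not spell out how self-information is computed from the nucleotide distributions. As a rough illustration only, the sketch below assumes a chi-square test of independence on the nucleotide counts observed at one position in two phenotype groups, and takes the self-information of that position to be -log2 of the resulting p-value. The function names, the choice of test, and the additive fragment score are all assumptions made for this example; none of them is taken from the article.

```python
import math

NUCLEOTIDES = "ACGT"


def chi2_sf(stat, df):
    """Upper-tail probability of a chi-square statistic, closed forms for df 0..3."""
    if df == 0 or stat <= 0.0:
        return 1.0
    if df == 1:
        return math.erfc(math.sqrt(stat / 2.0))
    if df == 2:
        return math.exp(-stat / 2.0)
    if df == 3:
        tail = math.sqrt(2.0 * stat / math.pi) * math.exp(-stat / 2.0)
        return math.erfc(math.sqrt(stat / 2.0)) + tail
    raise ValueError("df must be between 0 and 3")


def position_self_information(case_counts, control_counts):
    """Hypothetical score: -log2(p) of a chi-square test at one position.

    Each argument maps a nucleotide to its read count in one phenotype group.
    """
    # Ignore nucleotides never observed, so that no expected count is zero.
    observed = [n for n in NUCLEOTIDES
                if case_counts.get(n, 0) + control_counts.get(n, 0) > 0]
    case_total = sum(case_counts.get(n, 0) for n in observed)
    control_total = sum(control_counts.get(n, 0) for n in observed)
    total = case_total + control_total
    if case_total == 0 or control_total == 0 or len(observed) < 2:
        return 0.0  # nothing to compare at this position

    stat = 0.0
    for n in observed:
        column_total = case_counts.get(n, 0) + control_counts.get(n, 0)
        for count, row_total in ((case_counts.get(n, 0), case_total),
                                 (control_counts.get(n, 0), control_total)):
            expected = row_total * column_total / total
            stat += (count - expected) ** 2 / expected

    p_value = max(chi2_sf(stat, len(observed) - 1), 1e-300)  # avoid log(0)
    return max(0.0, -math.log2(p_value))


def fragment_self_information(positions):
    """Assumed fragment score: sum over positions, treating them as independent."""
    return sum(position_self_information(case, control)
               for case, control in positions)
```

Under these assumptions, a position whose nucleotide distribution is identical in both groups scores 0 bits, while a position where the groups carry different nucleotides scores highly, so a fragment that collects many such positions accumulates a large total.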
format Online
Article
Text
id pubmed-7913240
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-7913240 2021-02-28 Information Theoretic Metagenome Assembly Allows the Discovery of Disease Biomarkers in Human Microbiome Nalbantoglu, O. Ufuk Entropy (Basel) Article MDPI 2021-02-02 /pmc/articles/PMC7913240/ /pubmed/33540903 http://dx.doi.org/10.3390/e23020187 Text en © 2021 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
spellingShingle Article
Nalbantoglu, O. Ufuk
Information Theoretic Metagenome Assembly Allows the Discovery of Disease Biomarkers in Human Microbiome
title Information Theoretic Metagenome Assembly Allows the Discovery of Disease Biomarkers in Human Microbiome
title_full Information Theoretic Metagenome Assembly Allows the Discovery of Disease Biomarkers in Human Microbiome
title_fullStr Information Theoretic Metagenome Assembly Allows the Discovery of Disease Biomarkers in Human Microbiome
title_full_unstemmed Information Theoretic Metagenome Assembly Allows the Discovery of Disease Biomarkers in Human Microbiome
title_short Information Theoretic Metagenome Assembly Allows the Discovery of Disease Biomarkers in Human Microbiome
title_sort information theoretic metagenome assembly allows the discovery of disease biomarkers in human microbiome
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7913240/
https://www.ncbi.nlm.nih.gov/pubmed/33540903
http://dx.doi.org/10.3390/e23020187
work_keys_str_mv AT nalbantogluoufuk informationtheoreticmetagenomeassemblyallowsthediscoveryofdiseasebiomarkersinhumanmicrobiome