Comparison of k-mer-based de novo comparative metagenomic tools and approaches


Bibliographic Details
Main authors: Ponsero, Alise Jany; Miller, Matthew; Hurwitz, Bonnie Louise
Format: Online Article Text
Language: English
Published: OAE Publishing Inc. 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10696585/
http://dx.doi.org/10.20517/mrr.2023.26
author Ponsero, Alise Jany
Miller, Matthew
Hurwitz, Bonnie Louise
collection PubMed
description Aim: Comparative metagenomic analysis requires measuring pairwise similarity between the metagenomes in a dataset. Reference-based methods that compute a beta-diversity distance between two metagenomes are highly dependent on the quality and completeness of the reference database, and their application to less-studied microbiota can be challenging. In contrast, de novo comparative metagenomic methods rely only on the sequence composition of metagenomes to compare datasets. While each of these approaches has its strengths and limitations, comparisons between them remain limited.
Methods: We developed sets of simulated short-read metagenomes to (1) compare k-mer-based and taxonomy-based distances and evaluate the impact of technical and biological variables on these metrics and (2) evaluate the effect of k-mer sketching and filtering. We used a real-world metagenomic dataset to provide an overview of the currently available tools for de novo metagenomic comparative analysis.
Results: Using simulated metagenomes of known composition and controlled error rate, we showed that k-mer-based distance metrics were well correlated with the taxonomic distance metric for quantitative beta-diversity metrics, but the correlation was low for presence/absence distances. Community complexity, in terms of taxa richness, and sequencing depth significantly affected the quality of the k-mer-based distances, while the impact of low amounts of sequence contamination and sequencing error was limited. Finally, we benchmarked currently available de novo comparative metagenomic tools, compared their output on two datasets of fecal metagenomes, and showed that most k-mer-based tools were able to recapitulate the data structure observed using taxonomic approaches.
Conclusion: This study expands our understanding of the strengths and limitations of k-mer-based de novo comparative metagenomic approaches and aims to provide concrete guidelines for researchers interested in applying these approaches to their metagenomic datasets.
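The contrast drawn in the abstract between presence/absence and quantitative k-mer-based distances can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the function names, the toy reads, the choice of k = 4, and the use of Jaccard and Bray-Curtis as representative presence/absence and quantitative metrics. It is not the code or the exact metrics of the study, and it omits canonical (reverse-complement) k-mers, sketching, and filtering.

```python
from collections import Counter


def kmer_counts(reads, k):
    """Count every k-mer (substring of length k) across a list of reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def jaccard_distance(a, b):
    """Presence/absence distance: 1 - (shared k-mers / all distinct k-mers)."""
    keys_a, keys_b = set(a), set(b)
    union = keys_a | keys_b
    if not union:
        return 0.0
    return 1.0 - len(keys_a & keys_b) / len(union)


def bray_curtis_distance(a, b):
    """Quantitative distance: 1 - 2 * sum(min counts) / (sum of all counts)."""
    total = sum(a.values()) + sum(b.values())
    if total == 0:
        return 0.0
    shared = sum(min(a[kmer], b[kmer]) for kmer in a.keys() & b.keys())
    return 1.0 - 2.0 * shared / total


# Hypothetical toy "metagenomes": the same two sequences at different abundances.
sample_a = ["ACGTACGT", "ACGTACGT", "ACGTACGT", "TTTTGGGG"]
sample_b = ["ACGTACGT", "TTTTGGGG", "TTTTGGGG", "TTTTGGGG"]

counts_a = kmer_counts(sample_a, k=4)
counts_b = kmer_counts(sample_b, k=4)

print("Jaccard (presence/absence):", jaccard_distance(counts_a, counts_b))
print("Bray-Curtis (quantitative):", bray_curtis_distance(counts_a, counts_b))
```

In this toy case both samples contain the same set of k-mers, so the presence/absence distance is 0, while the quantitative distance is 0.5 because the abundances differ. This is only meant to show why the two families of metrics can disagree on the same data.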
format Online
Article
Text
id pubmed-10696585
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher OAE Publishing Inc.
record_format MEDLINE/PubMed
spelling pubmed-10696585 2023-12-06 Microbiome Res Rep Original Article OAE Publishing Inc. 2023-07-20 /pmc/articles/PMC10696585/ http://dx.doi.org/10.20517/mrr.2023.26 Text en © The Author(s) 2023. Open Access This article is licensed under a Creative Commons Attribution 4.0 International License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, sharing, adaptation, distribution and reproduction in any medium or format, for any purpose, even commercially, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
title Comparison of k-mer-based de novo comparative metagenomic tools and approaches
topic Original Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10696585/
http://dx.doi.org/10.20517/mrr.2023.26