
The selection of software and database for metagenomics sequence analysis impacts the outcome of microbial profiling and pathogen detection

Shotgun metagenomic sequencing analysis is widely used for microbial profiling of biological specimens and pathogen detection. However, very little is known about the technical biases caused by the choice of analysis software and databases on the biological specimen. In this study, we evaluated diff...


Bibliographic Details
Main authors: Xu, Ruijie; Rajeev, Sreekumari; Salvador, Liliana C. M.
Format: Online Article Text
Language: English
Published: Public Library of Science 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10081788/
https://www.ncbi.nlm.nih.gov/pubmed/37027361
http://dx.doi.org/10.1371/journal.pone.0284031
author Xu, Ruijie
Rajeev, Sreekumari
Salvador, Liliana C. M.
collection PubMed
description Shotgun metagenomic sequencing analysis is widely used for microbial profiling of biological specimens and pathogen detection. However, very little is known about the technical biases caused by the choice of analysis software and databases on the biological specimen. In this study, we evaluated different direct-read shotgun metagenomics taxonomic profiling software tools to characterize the microbial compositions of simulated mouse gut microbiome samples and of biological samples collected from wild rodents across multiple taxonomic levels. Using ten of the most widely used metagenomics software tools and four different databases, we demonstrated that obtaining an accurate species-level microbial profile using current direct-read metagenomics profiling software is still a challenging task. We also showed that the discrepancies in results when different databases and software were used could lead to significant variations in the distinct microbial taxa classified, in the characterizations of the microbial communities, and in the differentially abundant taxa identified. Differences in database contents and read profiling algorithms are the main contributors to these discrepancies. The inclusion of host genomes and of genomes of the taxa of interest in the databases is important for increasing the accuracy of profiling. Our analysis also showed that the software included in this study differed in its ability to detect the presence of Leptospira, a major zoonotic pathogen of One Health importance, especially at species-level resolution. We concluded that using different databases and software combinations can result in confounding biological conclusions in microbial profiling. Our study indicates that software and database selection must be based on the purpose of the study.
format Online
Article
Text
id pubmed-10081788
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-10081788 2023-04-08 PLoS One Research Article Public Library of Science 2023-04-07 /pmc/articles/PMC10081788/ /pubmed/37027361 http://dx.doi.org/10.1371/journal.pone.0284031 Text en © 2023 Xu et al. This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title The selection of software and database for metagenomics sequence analysis impacts the outcome of microbial profiling and pathogen detection
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10081788/
https://www.ncbi.nlm.nih.gov/pubmed/37027361
http://dx.doi.org/10.1371/journal.pone.0284031