The selection of software and database for metagenomics sequence analysis impacts the outcome of microbial profiling and pathogen detection
Shotgun metagenomic sequencing analysis is widely used for microbial profiling of biological specimens and pathogen detection. However, very little is known about the technical biases caused by the choice of analysis software and databases on the biological specimen. In this study, we evaluated diff...
Main Authors: | Xu, Ruijie; Rajeev, Sreekumari; Salvador, Liliana C. M. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science 2023 |
Subjects: | Research Article |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10081788/ https://www.ncbi.nlm.nih.gov/pubmed/37027361 http://dx.doi.org/10.1371/journal.pone.0284031 |
_version_ | 1785021190354501632 |
---|---|
author | Xu, Ruijie; Rajeev, Sreekumari; Salvador, Liliana C. M. |
author_facet | Xu, Ruijie; Rajeev, Sreekumari; Salvador, Liliana C. M. |
author_sort | Xu, Ruijie |
collection | PubMed |
description | Shotgun metagenomic sequencing analysis is widely used for microbial profiling of biological specimens and for pathogen detection. However, very little is known about the technical biases that the choice of analysis software and databases introduces into the results obtained from a biological specimen. In this study, we evaluated different direct read shotgun metagenomics taxonomic profiling software to characterize the microbial compositions of simulated mouse gut microbiome samples and of biological samples collected from wild rodents across multiple taxonomic levels. Using ten of the most widely used metagenomics software tools and four different databases, we demonstrated that obtaining an accurate species-level microbial profile using the current direct read metagenomics profiling software is still a challenging task. We also showed that the discrepancies in results when different databases and software were used could lead to significant variations in the distinct microbial taxa classified, in the characterizations of the microbial communities, and in the differentially abundant taxa identified. Differences in database contents and read profiling algorithms are the main contributors to these discrepancies. The inclusion of host genomes and of genomes of the taxa of interest in the databases is important for increasing the accuracy of profiling. Our analysis also showed that the software included in this study differed in their ability to detect the presence of Leptospira, a major zoonotic pathogen of One Health importance, especially at species-level resolution. We concluded that using different database and software combinations can result in confounding biological conclusions in microbial profiling. Our study shows that software and database selection must be based on the purpose of the study. |
format | Online Article Text |
id | pubmed-10081788 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-10081788 2023-04-08 The selection of software and database for metagenomics sequence analysis impacts the outcome of microbial profiling and pathogen detection Xu, Ruijie; Rajeev, Sreekumari; Salvador, Liliana C. M. PLoS One Research Article Public Library of Science 2023-04-07 /pmc/articles/PMC10081788/ /pubmed/37027361 http://dx.doi.org/10.1371/journal.pone.0284031 Text en © 2023 Xu et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. |
spellingShingle | Research Article Xu, Ruijie Rajeev, Sreekumari Salvador, Liliana C. M. The selection of software and database for metagenomics sequence analysis impacts the outcome of microbial profiling and pathogen detection |
title | The selection of software and database for metagenomics sequence analysis impacts the outcome of microbial profiling and pathogen detection |
title_sort | selection of software and database for metagenomics sequence analysis impacts the outcome of microbial profiling and pathogen detection |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10081788/ https://www.ncbi.nlm.nih.gov/pubmed/37027361 http://dx.doi.org/10.1371/journal.pone.0284031 |