SEQ2MGS: an effective tool for generating realistic artificial metagenomes from the existing sequencing data
Assessment of bioinformatics tools for the metagenomics analysis from the whole genome sequencing data requires realistic benchmark sets. We developed an effective and simple generator of artificial metagenomes from real sequencing experiments. The tool (SEQ2MGS) analyzes the input FASTQ files, prec...
Main Authors: | Van Camp, Pieter-Jan, Porollo, Aleksey |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2022 |
Subjects: | Methods Article |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9310082/ https://www.ncbi.nlm.nih.gov/pubmed/35899079 http://dx.doi.org/10.1093/nargab/lqac050 |
_version_ | 1784753313959378944 |
---|---|
author | Van Camp, Pieter-Jan Porollo, Aleksey |
author_facet | Van Camp, Pieter-Jan Porollo, Aleksey |
author_sort | Van Camp, Pieter-Jan |
collection | PubMed |
description | Assessment of bioinformatics tools for the metagenomics analysis from the whole genome sequencing data requires realistic benchmark sets. We developed an effective and simple generator of artificial metagenomes from real sequencing experiments. The tool (SEQ2MGS) analyzes the input FASTQ files, precomputes genomic content, and blends shotgun reads from different sequenced isolates, or spikes isolate(s) into a real metagenome, in desired proportions. SEQ2MGS eliminates the need for simulation of sequencing platform variations, read distributions, presence of plasmids, viruses, and contamination. The tool is especially useful for quick generation of multiple complex samples that include new or understudied organisms, even without assembled genomes. For illustration, we first demonstrated the ease of use of SEQ2MGS for the simulation of altered Schaedler flora (ASF) in comparison with de novo metagenomics generators Grinder and CAMISIM. Next, we emulated the emergence of a pathogen in the human gut microbiome and observed that Kraken, Centrifuge, and MetaPhlAn, while correctly identifying Klebsiella pneumoniae, produced inconsistent results for the rest of the real metagenome. Finally, using the MG-RAST platform, we affirmed that SEQ2MGS properly transfers genomic information from an isolate into the simulated metagenome by the correct identification of antimicrobial resistance genes anticipated to appear compared to the original metagenome. |
format | Online Article Text |
id | pubmed-9310082 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-9310082 2022-07-26 SEQ2MGS: an effective tool for generating realistic artificial metagenomes from the existing sequencing data Van Camp, Pieter-Jan Porollo, Aleksey NAR Genom Bioinform Methods Article Assessment of bioinformatics tools for the metagenomics analysis from the whole genome sequencing data requires realistic benchmark sets. We developed an effective and simple generator of artificial metagenomes from real sequencing experiments. The tool (SEQ2MGS) analyzes the input FASTQ files, precomputes genomic content, and blends shotgun reads from different sequenced isolates, or spikes isolate(s) into a real metagenome, in desired proportions. SEQ2MGS eliminates the need for simulation of sequencing platform variations, read distributions, presence of plasmids, viruses, and contamination. The tool is especially useful for quick generation of multiple complex samples that include new or understudied organisms, even without assembled genomes. For illustration, we first demonstrated the ease of use of SEQ2MGS for the simulation of altered Schaedler flora (ASF) in comparison with de novo metagenomics generators Grinder and CAMISIM. Next, we emulated the emergence of a pathogen in the human gut microbiome and observed that Kraken, Centrifuge, and MetaPhlAn, while correctly identifying Klebsiella pneumoniae, produced inconsistent results for the rest of the real metagenome. Finally, using the MG-RAST platform, we affirmed that SEQ2MGS properly transfers genomic information from an isolate into the simulated metagenome by the correct identification of antimicrobial resistance genes anticipated to appear compared to the original metagenome. Oxford University Press 2022-07-25 /pmc/articles/PMC9310082/ /pubmed/35899079 http://dx.doi.org/10.1093/nargab/lqac050 Text en © The Author(s) 2022. Published by Oxford University Press on behalf of NAR Genomics and Bioinformatics. https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Methods Article Van Camp, Pieter-Jan Porollo, Aleksey SEQ2MGS: an effective tool for generating realistic artificial metagenomes from the existing sequencing data |
title | SEQ2MGS: an effective tool for generating realistic artificial metagenomes from the existing sequencing data |
title_full | SEQ2MGS: an effective tool for generating realistic artificial metagenomes from the existing sequencing data |
title_fullStr | SEQ2MGS: an effective tool for generating realistic artificial metagenomes from the existing sequencing data |
title_full_unstemmed | SEQ2MGS: an effective tool for generating realistic artificial metagenomes from the existing sequencing data |
title_short | SEQ2MGS: an effective tool for generating realistic artificial metagenomes from the existing sequencing data |
title_sort | seq2mgs: an effective tool for generating realistic artificial metagenomes from the existing sequencing data |
topic | Methods Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9310082/ https://www.ncbi.nlm.nih.gov/pubmed/35899079 http://dx.doi.org/10.1093/nargab/lqac050 |
work_keys_str_mv | AT vancamppieterjan seq2mgsaneffectivetoolforgeneratingrealisticartificialmetagenomesfromtheexistingsequencingdata AT porolloaleksey seq2mgsaneffectivetoolforgeneratingrealisticartificialmetagenomesfromtheexistingsequencingdata |
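
The description field above mentions blending shotgun reads from different sources "in desired proportions". The following is a minimal Python sketch of that general idea only. It is not SEQ2MGS code and makes no claim about how SEQ2MGS works internally: the function names (`parse_fastq`, `blend_reads`), the toy input data, and the choice to define proportions by read count (rather than by bases or genome coverage) are all assumptions made for illustration.

```python
import io
import random
from typing import Dict, Iterator, List, Tuple

# One FASTQ record: header line, sequence, separator line, quality string.
Read = Tuple[str, str, str, str]


def parse_fastq(handle) -> Iterator[Read]:
    """Yield 4-line FASTQ records from a text handle (hypothetical helper)."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            return  # end of input
        seq = handle.readline().rstrip("\n")
        sep = handle.readline().rstrip("\n")
        qual = handle.readline().rstrip("\n")
        yield (header, seq, sep, qual)


def blend_reads(
    sources: Dict[str, List[Read]],
    fractions: Dict[str, float],
    total_reads: int,
    seed: int = 0,
) -> List[Read]:
    """Subsample each source without replacement and mix the picks.

    Assumption: a fraction is a share of the total READ COUNT.
    """
    if abs(sum(fractions.values()) - 1.0) > 1e-9:
        raise ValueError("fractions must sum to 1")
    rng = random.Random(seed)
    mixed: List[Read] = []
    for name, reads in sources.items():
        wanted = round(total_reads * fractions[name])
        if wanted > len(reads):
            raise ValueError(f"{name}: need {wanted} reads, have {len(reads)}")
        for header, seq, sep, qual in rng.sample(reads, wanted):
            # Tag the header so the origin of each read stays traceable.
            mixed.append((f"{header} source={name}", seq, sep, qual))
    rng.shuffle(mixed)  # interleave reads from the different sources
    return mixed


def toy_fastq(prefix: str, count: int) -> str:
    """Build a tiny made-up FASTQ text; the reads are not real data."""
    return "".join(f"@{prefix}_{i}\nACGTACGT\n+\nIIIIIIII\n" for i in range(count))


# Toy inputs standing in for a metagenome and an isolate to spike in.
sources = {
    "metagenome": list(parse_fastq(io.StringIO(toy_fastq("mg", 20)))),
    "isolate_a": list(parse_fastq(io.StringIO(toy_fastq("iso", 20)))),
}
mixed = blend_reads(sources, {"metagenome": 0.3, "isolate_a": 0.7}, total_reads=10)
```

Sampling without replacement keeps every output read a real, unmodified input read, which matches the record's point that mixing existing sequencing data avoids having to simulate platform-specific effects. A real workflow would also need to handle paired-end files and differences in read length or genome size, which this sketch ignores.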