Characterization and simulation of metagenomic nanopore sequencing data with Meta-NanoSim
BACKGROUND: Nanopore sequencing is crucial to metagenomic studies as its kilobase-long reads can contribute to resolving genomic structural differences among microbes. However, sequencing platform-specific challenges, including high base-call error rate, nonuniform read lengths, and the presence of...
Main Authors: | Yang, Chen; Lo, Theodora; Nip, Ka Ming; Hafezqorani, Saber; Warren, René L; Birol, Inanc |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2023 |
Subjects: | Technical Note |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10025935/ https://www.ncbi.nlm.nih.gov/pubmed/36939007 http://dx.doi.org/10.1093/gigascience/giad013 |
_version_ | 1784909440201261056 |
---|---|
author | Yang, Chen; Lo, Theodora; Nip, Ka Ming; Hafezqorani, Saber; Warren, René L; Birol, Inanc |
author_facet | Yang, Chen; Lo, Theodora; Nip, Ka Ming; Hafezqorani, Saber; Warren, René L; Birol, Inanc |
author_sort | Yang, Chen |
collection | PubMed |
description | BACKGROUND: Nanopore sequencing is crucial to metagenomic studies as its kilobase-long reads can contribute to resolving genomic structural differences among microbes. However, sequencing platform-specific challenges, including high base-call error rate, nonuniform read lengths, and the presence of chimeric artifacts, necessitate specifically designed analytical algorithms. The use of simulated datasets with characteristics that are true to the sequencing platform under evaluation is a cost-effective way to assess the performance of bioinformatics tools with the ground truth in a controlled environment. RESULTS: Here, we present Meta-NanoSim, a fast and versatile utility that characterizes and simulates the unique properties of nanopore metagenomic reads. It improves upon state-of-the-art methods on microbial abundance estimation through a base-level quantification algorithm. Meta-NanoSim can simulate complex microbial communities composed of both linear and circular genomes and can stream reference genomes from online servers directly. Simulated datasets showed high congruence with experimental data in terms of read length, error profiles, and abundance levels. We demonstrate that Meta-NanoSim simulated data can facilitate the development of metagenomic algorithms and guide experimental design through a metagenome assembly benchmarking task. CONCLUSIONS: The Meta-NanoSim characterization module investigates read features, including chimeric information and abundance levels, while the simulation module simulates large and complex multisample microbial communities with different abundance profiles. All trained models and the software are freely accessible at GitHub: https://github.com/bcgsc/NanoSim. |
format | Online Article Text |
id | pubmed-10025935 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-10025935 2023-03-21 Characterization and simulation of metagenomic nanopore sequencing data with Meta-NanoSim Yang, Chen; Lo, Theodora; Nip, Ka Ming; Hafezqorani, Saber; Warren, René L; Birol, Inanc Gigascience Technical Note Oxford University Press 2023-03-20 /pmc/articles/PMC10025935/ /pubmed/36939007 http://dx.doi.org/10.1093/gigascience/giad013 Text en © The Author(s) 2023. Published by Oxford University Press GigaScience. https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
title | Characterization and simulation of metagenomic nanopore sequencing data with Meta-NanoSim |
topic | Technical Note |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10025935/ https://www.ncbi.nlm.nih.gov/pubmed/36939007 http://dx.doi.org/10.1093/gigascience/giad013 |