
CuReSim-LoRM: A Tool to Simulate Metabarcoding Long Reads

Metabarcoding DNA sequencing has revolutionized the study of microbial communities. Third-generation sequencing, which produces long reads, has opened up new perspectives. Obtaining the full-length ribosomal RNA gene would allow better taxonomic resolution, down to the species or strain level...


Bibliographic Details
Main authors: Mesloub, Yasmina, Beury, Delphine, Vandermeeren, Félix, Caboche, Ségolène
Format: Online Article Text
Language: English
Published: MDPI 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10531135/
https://www.ncbi.nlm.nih.gov/pubmed/37762307
http://dx.doi.org/10.3390/ijms241814005
collection PubMed
description Metabarcoding DNA sequencing has revolutionized the study of microbial communities. Third-generation sequencing, which produces long reads, has opened up new perspectives. Obtaining the full-length ribosomal RNA gene would allow better taxonomic resolution, down to the species or strain level. However, Oxford Nanopore Technologies (ONT) sequencing produces reads with high error rates, which introduce biases into the analysis. Understanding these biases allows one to better interpret the biological results and to exercise care with the conclusions drawn from metabarcoding experiments. To benchmark an analysis process, the ground truth, i.e., the real composition of the microbial community, has to be known. In addition to artificial mock communities, simulated data are often used to evaluate the biases and performance of the bioinformatics analysis step. Currently, no dedicated tool exists that simulates metabarcoding long reads, mimics their error rate and length distribution, and thereby allows one to benchmark the analysis process. Here, we introduce CuReSim-LoRM (customized read simulator to generate long reads for metabarcoding). We show that CuReSim-LoRM can produce reads with varying error rates and length distributions, and that these reads closely mimic real data.
id pubmed-10531135
institution National Center for Biotechnology Information
record_format MEDLINE/PubMed
spelling pubmed-10531135 2023-09-28 CuReSim-LoRM: A Tool to Simulate Metabarcoding Long Reads. Mesloub, Yasmina; Beury, Delphine; Vandermeeren, Félix; Caboche, Ségolène. Int J Mol Sci. Article. MDPI, 2023-09-12. /pmc/articles/PMC10531135/ /pubmed/37762307 http://dx.doi.org/10.3390/ijms241814005 Text, en. © 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).