
Assessment of metrics in next-generation sequencing experiments for use in core-genome multilocus sequence type

With the reduction in the cost of next-generation sequencing, whole-genome sequencing (WGS)–based methods such as core-genome multilocus sequence type (cgMLST) have been widely used. However, gene-based methods are required to assemble raw reads to contigs, thus possibly introducing errors into asse...


Bibliographic Details
Main authors: Liu, Yen-Yi, Chen, Bo-Han, Chen, Chih-Chieh, Chiou, Chien-Shun
Format: Online Article Text
Language: English
Published: PeerJ Inc. 2021
Subjects: Bioinformatics
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8380430/
https://www.ncbi.nlm.nih.gov/pubmed/34466283
http://dx.doi.org/10.7717/peerj.11842
_version_ 1783741196727746560
author Liu, Yen-Yi
Chen, Bo-Han
Chen, Chih-Chieh
Chiou, Chien-Shun
author_facet Liu, Yen-Yi
Chen, Bo-Han
Chen, Chih-Chieh
Chiou, Chien-Shun
author_sort Liu, Yen-Yi
collection PubMed
description With the reduction in the cost of next-generation sequencing, whole-genome sequencing (WGS)–based methods such as core-genome multilocus sequence type (cgMLST) have been widely used. However, gene-based methods are required to assemble raw reads to contigs, thus possibly introducing errors into assemblies. Because the robustness of cgMLST depends on the quality of assemblies, the results of WGS should be assessed (from sequencing to assembly). In this study, we investigated the robustness of different read lengths, read depths, and assemblers in recovering genes from reference genomes. Different combinations of read lengths and read depths were simulated from the complete genomes of three common food-borne pathogens: Escherichia coli, Listeria monocytogenes, and Salmonella enterica. We found that the quality of assemblies was mainly affected by read depth, irrespective of the assembler used. In addition, we suggest several cutoff values for future cgMLST experiments. Furthermore, we recommend the combinations of read lengths, read depths, and assemblers that can result in a higher cost/performance ratio for cgMLST.
format Online
Article
Text
id pubmed-8380430
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher PeerJ Inc.
record_format MEDLINE/PubMed
spelling pubmed-8380430 2021-08-30 Assessment of metrics in next-generation sequencing experiments for use in core-genome multilocus sequence type Liu, Yen-Yi Chen, Bo-Han Chen, Chih-Chieh Chiou, Chien-Shun PeerJ Bioinformatics With the reduction in the cost of next-generation sequencing, whole-genome sequencing (WGS)–based methods such as core-genome multilocus sequence type (cgMLST) have been widely used. However, gene-based methods are required to assemble raw reads to contigs, thus possibly introducing errors into assemblies. Because the robustness of cgMLST depends on the quality of assemblies, the results of WGS should be assessed (from sequencing to assembly). In this study, we investigated the robustness of different read lengths, read depths, and assemblers in recovering genes from reference genomes. Different combinations of read lengths and read depths were simulated from the complete genomes of three common food-borne pathogens: Escherichia coli, Listeria monocytogenes, and Salmonella enterica. We found that the quality of assemblies was mainly affected by read depth, irrespective of the assembler used. In addition, we suggest several cutoff values for future cgMLST experiments. Furthermore, we recommend the combinations of read lengths, read depths, and assemblers that can result in a higher cost/performance ratio for cgMLST. PeerJ Inc. 2021-08-19 /pmc/articles/PMC8380430/ /pubmed/34466283 http://dx.doi.org/10.7717/peerj.11842 Text en ©2021 Liu et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ) and either DOI or URL of the article must be cited.
spellingShingle Bioinformatics
Liu, Yen-Yi
Chen, Bo-Han
Chen, Chih-Chieh
Chiou, Chien-Shun
Assessment of metrics in next-generation sequencing experiments for use in core-genome multilocus sequence type
title Assessment of metrics in next-generation sequencing experiments for use in core-genome multilocus sequence type
title_full Assessment of metrics in next-generation sequencing experiments for use in core-genome multilocus sequence type
title_fullStr Assessment of metrics in next-generation sequencing experiments for use in core-genome multilocus sequence type
title_full_unstemmed Assessment of metrics in next-generation sequencing experiments for use in core-genome multilocus sequence type
title_short Assessment of metrics in next-generation sequencing experiments for use in core-genome multilocus sequence type
title_sort assessment of metrics in next-generation sequencing experiments for use in core-genome multilocus sequence type
topic Bioinformatics
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8380430/
https://www.ncbi.nlm.nih.gov/pubmed/34466283
http://dx.doi.org/10.7717/peerj.11842
work_keys_str_mv AT liuyenyi assessmentofmetricsinnextgenerationsequencingexperimentsforuseincoregenomemultilocussequencetype
AT chenbohan assessmentofmetricsinnextgenerationsequencingexperimentsforuseincoregenomemultilocussequencetype
AT chenchihchieh assessmentofmetricsinnextgenerationsequencingexperimentsforuseincoregenomemultilocussequencetype
AT chiouchienshun assessmentofmetricsinnextgenerationsequencingexperimentsforuseincoregenomemultilocussequencetype