Assessment of metrics in next-generation sequencing experiments for use in core-genome multilocus sequence type

With the reduction in the cost of next-generation sequencing, whole-genome sequencing (WGS)–based methods such as core-genome multilocus sequence type (cgMLST) have been widely used. However, gene-based methods are required to assemble raw reads to contigs, thus possibly introducing errors into asse...

Bibliographic Details
Main Authors:	Liu, Yen-Yi; Chen, Bo-Han; Chen, Chih-Chieh; Chiou, Chien-Shun
Format:	Online Article Text
Language:	English
Published:	PeerJ Inc. 2021
Subjects:	Bioinformatics
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8380430/ https://www.ncbi.nlm.nih.gov/pubmed/34466283 http://dx.doi.org/10.7717/peerj.11842

_version_	1783741196727746560
author	Liu, Yen-Yi Chen, Bo-Han Chen, Chih-Chieh Chiou, Chien-Shun
author_facet	Liu, Yen-Yi Chen, Bo-Han Chen, Chih-Chieh Chiou, Chien-Shun
author_sort	Liu, Yen-Yi
collection	PubMed
description	With the reduction in the cost of next-generation sequencing, whole-genome sequencing (WGS)–based methods such as core-genome multilocus sequence type (cgMLST) have been widely used. However, gene-based methods are required to assemble raw reads to contigs, thus possibly introducing errors into assemblies. Because the robustness of cgMLST depends on the quality of assemblies, the results of WGS should be assessed (from sequencing to assembly). In this study, we investigated the robustness of different read lengths, read depths, and assemblers in recovering genes from reference genomes. Different combinations of read lengths and read depths were simulated from the complete genomes of three common food-borne pathogens: Escherichia coli, Listeria monocytogenes, and Salmonella enterica. We found that the quality of assemblies was mainly affected by read depth, irrespective of the assembler used. In addition, we suggest several cutoff values for future cgMLST experiments. Furthermore, we recommend the combinations of read lengths, read depths, and assemblers that can result in a higher cost/performance ratio for cgMLST.
format	Online Article Text
id	pubmed-8380430
institution	National Center for Biotechnology Information
language	English
publishDate	2021
publisher	PeerJ Inc.
record_format	MEDLINE/PubMed
spelling	pubmed-83804302021-08-30 Assessment of metrics in next-generation sequencing experiments for use in core-genome multilocus sequence type Liu, Yen-Yi Chen, Bo-Han Chen, Chih-Chieh Chiou, Chien-Shun PeerJ Bioinformatics With the reduction in the cost of next-generation sequencing, whole-genome sequencing (WGS)–based methods such as core-genome multilocus sequence type (cgMLST) have been widely used. However, gene-based methods are required to assemble raw reads to contigs, thus possibly introducing errors into assemblies. Because the robustness of cgMLST depends on the quality of assemblies, the results of WGS should be assessed (from sequencing to assembly). In this study, we investigated the robustness of different read lengths, read depths, and assemblers in recovering genes from reference genomes. Different combinations of read lengths and read depths were simulated from the complete genomes of three common food-borne pathogens: Escherichia coli, Listeria monocytogenes, and Salmonella enterica. We found that the quality of assemblies was mainly affected by read depth, irrespective of the assembler used. In addition, we suggest several cutoff values for future cgMLST experiments. Furthermore, we recommend the combinations of read lengths, read depths, and assemblers that can result in a higher cost/performance ratio for cgMLST. PeerJ Inc. 2021-08-19 /pmc/articles/PMC8380430/ /pubmed/34466283 http://dx.doi.org/10.7717/peerj.11842 Text en ©2021 Liu et al. https://creativecommons.org/licenses/by/4.0/This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/) , which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ) and either DOI or URL of the article must be cited.
spellingShingle	Bioinformatics Liu, Yen-Yi Chen, Bo-Han Chen, Chih-Chieh Chiou, Chien-Shun Assessment of metrics in next-generation sequencing experiments for use in core-genome multilocus sequence type
title	Assessment of metrics in next-generation sequencing experiments for use in core-genome multilocus sequence type
topic	Bioinformatics
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8380430/ https://www.ncbi.nlm.nih.gov/pubmed/34466283 http://dx.doi.org/10.7717/peerj.11842

Similar Items