Evaluating the accuracy of Listeria monocytogenes assemblies from quasimetagenomic samples using long and short reads

BACKGROUND: Whole genome sequencing of cultured pathogens is the state of the art public health response for the bioinformatic source tracking of illness outbreaks. Quasimetagenomics can substantially reduce the amount of culturing needed before a high quality genome can be recovered. Highly accurat...

Bibliographic Details
Main Authors: Commichaux, Seth; Javkar, Kiran; Ramachandran, Padmini; Nagarajan, Niranjan; Bertrand, Denis; Chen, Yi; Reed, Elizabeth; Gonzalez-Escalona, Narjol; Strain, Errol; Rand, Hugh; Pop, Mihai; Ottesen, Andrea
Format: Online Article Text
Language: English
Published: BioMed Central 2021
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8157722/
https://www.ncbi.nlm.nih.gov/pubmed/34039264
http://dx.doi.org/10.1186/s12864-021-07702-2
author Commichaux, Seth
Javkar, Kiran
Ramachandran, Padmini
Nagarajan, Niranjan
Bertrand, Denis
Chen, Yi
Reed, Elizabeth
Gonzalez-Escalona, Narjol
Strain, Errol
Rand, Hugh
Pop, Mihai
Ottesen, Andrea
collection PubMed
description BACKGROUND: Whole genome sequencing of cultured pathogens is the state of the art public health response for the bioinformatic source tracking of illness outbreaks. Quasimetagenomics can substantially reduce the amount of culturing needed before a high quality genome can be recovered. Highly accurate short read data is analyzed for single nucleotide polymorphisms and multi-locus sequence types to differentiate strains but cannot span many genomic repeats, resulting in highly fragmented assemblies. Long reads can span repeats, resulting in much more contiguous assemblies, but have lower accuracy than short reads.
RESULTS: We evaluated the accuracy of Listeria monocytogenes assemblies from enrichments (quasimetagenomes) of naturally-contaminated ice cream using long read (Oxford Nanopore) and short read (Illumina) sequencing data. Accuracy of ten assembly approaches, over a range of sequencing depths, was evaluated by comparing sequence similarity of genes in assemblies to a complete reference genome. Long read assemblies reconstructed a circularized genome as well as a 71 kbp plasmid after 24 h of enrichment; however, high error rates prevented high fidelity gene assembly, even at 150X depth of coverage. Short read assemblies accurately reconstructed the core genes after 28 h of enrichment but produced highly fragmented genomes. Hybrid approaches demonstrated promising results but had biases based upon the initial assembly strategy. Short read assemblies scaffolded with long reads accurately assembled the core genes after just 24 h of enrichment, but were highly fragmented. Long read assemblies polished with short reads reconstructed a circularized genome and plasmid and assembled all the genes after 24 h enrichment but with less fidelity for the core genes than the short read assemblies.
CONCLUSION: The integration of long and short read sequencing of quasimetagenomes expedited the reconstruction of a high quality pathogen genome compared to either platform alone. A new and more complete level of information about genome structure, gene order and mobile elements can be added to the public health response by incorporating long read analyses with the standard short read WGS outbreak response.
SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12864-021-07702-2.
format Online
Article
Text
id pubmed-8157722
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-8157722 2021-05-28 BMC Genomics Research Article BioMed Central 2021-05-26 /pmc/articles/PMC8157722/ /pubmed/34039264 http://dx.doi.org/10.1186/s12864-021-07702-2 Text en
© The Author(s) 2021 https://creativecommons.org/licenses/by/4.0/ Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit https://creativecommons.org/licenses/by/4.0/.
The Creative Commons Public Domain Dedication waiver (https://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
title Evaluating the accuracy of Listeria monocytogenes assemblies from quasimetagenomic samples using long and short reads
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8157722/
https://www.ncbi.nlm.nih.gov/pubmed/34039264
http://dx.doi.org/10.1186/s12864-021-07702-2