Long read genome assemblers struggle with small plasmids

Whole-genome sequencing has become a preferred method for studying bacterial plasmids, as it is generally assumed to capture the entire genome. However, long-read genome assemblers have been shown to sometimes miss plasmid sequences – an issue that has been associated with plasmid size. The purpose...

Bibliographic Details
Main authors: Johnson, Jared; Soehnlen, Marty; Blankenship, Heather M.
Format: Online Article Text
Language: English
Published: Microbiology Society 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10272865/
https://www.ncbi.nlm.nih.gov/pubmed/37224062
http://dx.doi.org/10.1099/mgen.0.001024
author Johnson, Jared
Soehnlen, Marty
Blankenship, Heather M.
collection PubMed
description Whole-genome sequencing has become a preferred method for studying bacterial plasmids, as it is generally assumed to capture the entire genome. However, long-read genome assemblers have been shown to sometimes miss plasmid sequences – an issue that has been associated with plasmid size. The purpose of this study was to investigate the relationship between plasmid size and plasmid recovery by the long-read-only assemblers, Flye, Raven, Miniasm, and Canu. This was accomplished by determining the number of times each assembler successfully recovered 33 plasmids, ranging from 1919 to 194 062 bp in size and belonging to 14 bacterial isolates from six bacterial genera, using Oxford Nanopore long reads. These results were additionally compared to plasmid recovery rates by the short-read-first assembler, Unicycler, using both Oxford Nanopore long reads and Illumina short reads. Results from this study indicate that Canu, Flye, Miniasm, and Raven are prone to missing plasmid sequences, whereas Unicycler was successful at recovering 100 % of plasmid sequences. Excluding Canu, most plasmid loss by long-read-only assemblers was due to failure to recover plasmids smaller than 10 kb. As such, it is recommended that Unicycler be used to increase the likelihood of plasmid recovery during bacterial genome assembly.
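The comparison described above (counting how often each assembler recovers each plasmid, then looking at plasmids below 10 kb separately) can be sketched as a small tally. This is a minimal illustration, not the authors' analysis code: the function name `recovery_by_size_class`, the record layout, and the placeholder assembler names are assumptions, and the example records are made up and are not results from the study.

```python
from collections import defaultdict

# Size threshold taken from the abstract (10 kb); everything else is hypothetical.
SMALL_PLASMID_BP = 10_000


def recovery_by_size_class(records, threshold_bp=SMALL_PLASMID_BP):
    """Return {assembler: {"small": rate, "large": rate}}.

    records: iterable of (assembler, plasmid_size_bp, recovered) tuples,
    one per assembly attempt. A rate is None when a class has no attempts.
    """
    # Each size class holds [number recovered, number attempted].
    counts = defaultdict(lambda: {"small": [0, 0], "large": [0, 0]})
    for assembler, size_bp, recovered in records:
        size_class = "small" if size_bp < threshold_bp else "large"
        tally = counts[assembler][size_class]
        tally[0] += int(recovered)
        tally[1] += 1
    return {
        assembler: {
            size_class: (hits / total if total else None)
            for size_class, (hits, total) in classes.items()
        }
        for assembler, classes in counts.items()
    }


# Made-up placeholder records for illustration only, NOT data from the study.
example_records = [
    ("assembler_A", 2_000, False),
    ("assembler_A", 5_000, True),
    ("assembler_A", 50_000, True),
    ("assembler_B", 2_000, True),
    ("assembler_B", 50_000, True),
]
rates = recovery_by_size_class(example_records)
```

Splitting the tally at a fixed threshold keeps the small-plasmid recovery rate visible even when an assembler recovers most of the larger plasmids.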
format Online
Article
Text
id pubmed-10272865
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Microbiology Society
record_format MEDLINE/PubMed
spelling pubmed-10272865 2023-06-17 Long read genome assemblers struggle with small plasmids. Johnson, Jared; Soehnlen, Marty; Blankenship, Heather M. Microb Genom, Short Communications. Microbiology Society, 2023-05-24. /pmc/articles/PMC10272865/ /pubmed/37224062 http://dx.doi.org/10.1099/mgen.0.001024 Text en. © 2023 The Authors. This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/).
title Long read genome assemblers struggle with small plasmids
topic Short Communications
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10272865/
https://www.ncbi.nlm.nih.gov/pubmed/37224062
http://dx.doi.org/10.1099/mgen.0.001024