Long read genome assemblers struggle with small plasmids
Whole-genome sequencing has become a preferred method for studying bacterial plasmids, as it is generally assumed to capture the entire genome. However, long-read genome assemblers have been shown to sometimes miss plasmid sequences – an issue that has been associated with plasmid size. The purpose...
Main Authors: | Johnson, Jared; Soehnlen, Marty; Blankenship, Heather M. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Microbiology Society 2023 |
Subjects: | Short Communications |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10272865/ https://www.ncbi.nlm.nih.gov/pubmed/37224062 http://dx.doi.org/10.1099/mgen.0.001024 |
_version_ | 1785059593717547008 |
---|---|
author | Johnson, Jared Soehnlen, Marty Blankenship, Heather M. |
author_facet | Johnson, Jared Soehnlen, Marty Blankenship, Heather M. |
author_sort | Johnson, Jared |
collection | PubMed |
description | Whole-genome sequencing has become a preferred method for studying bacterial plasmids, as it is generally assumed to capture the entire genome. However, long-read genome assemblers have been shown to sometimes miss plasmid sequences – an issue that has been associated with plasmid size. The purpose of this study was to investigate the relationship between plasmid size and plasmid recovery by the long-read-only assemblers, Flye, Raven, Miniasm, and Canu. This was accomplished by determining the number of times each assembler successfully recovered 33 plasmids, ranging from 1919 to 194 062 bp in size and belonging to 14 bacterial isolates from six bacterial genera, using Oxford Nanopore long reads. These results were additionally compared to plasmid recovery rates by the short-read-first assembler, Unicycler, using both Oxford Nanopore long reads and Illumina short reads. Results from this study indicate that Canu, Flye, Miniasm, and Raven are prone to missing plasmid sequences, whereas Unicycler was successful at recovering 100 % of plasmid sequences. Excluding Canu, most plasmid loss by long-read-only assemblers was due to failure to recover plasmids smaller than 10 kb. As such, it is recommended that Unicycler be used to increase the likelihood of plasmid recovery during bacterial genome assembly. |
format | Online Article Text |
id | pubmed-10272865 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Microbiology Society |
record_format | MEDLINE/PubMed |
spelling | pubmed-10272865 2023-06-17 Long read genome assemblers struggle with small plasmids Johnson, Jared Soehnlen, Marty Blankenship, Heather M. Microb Genom Short Communications Whole-genome sequencing has become a preferred method for studying bacterial plasmids, as it is generally assumed to capture the entire genome. However, long-read genome assemblers have been shown to sometimes miss plasmid sequences – an issue that has been associated with plasmid size. The purpose of this study was to investigate the relationship between plasmid size and plasmid recovery by the long-read-only assemblers, Flye, Raven, Miniasm, and Canu. This was accomplished by determining the number of times each assembler successfully recovered 33 plasmids, ranging from 1919 to 194 062 bp in size and belonging to 14 bacterial isolates from six bacterial genera, using Oxford Nanopore long reads. These results were additionally compared to plasmid recovery rates by the short-read-first assembler, Unicycler, using both Oxford Nanopore long reads and Illumina short reads. Results from this study indicate that Canu, Flye, Miniasm, and Raven are prone to missing plasmid sequences, whereas Unicycler was successful at recovering 100 % of plasmid sequences. Excluding Canu, most plasmid loss by long-read-only assemblers was due to failure to recover plasmids smaller than 10 kb. As such, it is recommended that Unicycler be used to increase the likelihood of plasmid recovery during bacterial genome assembly. Microbiology Society 2023-05-24 /pmc/articles/PMC10272865/ /pubmed/37224062 http://dx.doi.org/10.1099/mgen.0.001024 Text en © 2023 The Authors https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License. |
spellingShingle | Short Communications Johnson, Jared Soehnlen, Marty Blankenship, Heather M. Long read genome assemblers struggle with small plasmids |
title | Long read genome assemblers struggle with small plasmids |
topic | Short Communications |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10272865/ https://www.ncbi.nlm.nih.gov/pubmed/37224062 http://dx.doi.org/10.1099/mgen.0.001024 |