
Unicycler: Resolving bacterial genome assemblies from short and long sequencing reads

The Illumina DNA sequencing platform generates accurate but short reads, which can be used to produce accurate but fragmented genome assemblies. Pacific Biosciences and Oxford Nanopore Technologies DNA sequencing platforms generate long reads that can produce complete genome assemblies, but the sequ...


Bibliographic Details
Main Authors: Wick, Ryan R., Judd, Louise M., Gorrie, Claire L., Holt, Kathryn E.
Format: Online Article Text
Language: English
Published: Public Library of Science 2017
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5481147/
https://www.ncbi.nlm.nih.gov/pubmed/28594827
http://dx.doi.org/10.1371/journal.pcbi.1005595
author Wick, Ryan R.
Judd, Louise M.
Gorrie, Claire L.
Holt, Kathryn E.
collection PubMed
description The Illumina DNA sequencing platform generates accurate but short reads, which can be used to produce accurate but fragmented genome assemblies. Pacific Biosciences and Oxford Nanopore Technologies DNA sequencing platforms generate long reads that can produce complete genome assemblies, but the sequencing is more expensive and error-prone. There is significant interest in combining data from these complementary sequencing technologies to generate more accurate “hybrid” assemblies. However, few tools exist that truly leverage the benefits of both types of data, namely the accuracy of short reads and the structural resolving power of long reads. Here we present Unicycler, a new tool for assembling bacterial genomes from a combination of short and long reads, which produces assemblies that are accurate, complete and cost-effective. Unicycler builds an initial assembly graph from short reads using the de novo assembler SPAdes and then simplifies the graph using information from short and long reads. Unicycler uses a novel semi-global aligner to align long reads to the assembly graph. Tests on both synthetic and real reads show Unicycler can assemble larger contigs with fewer misassemblies than other hybrid assemblers, even when long-read depth and accuracy are low. Unicycler is open source (GPLv3) and available at github.com/rrwick/Unicycler.
format Online
Article
Text
id pubmed-5481147
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-5481147 2017-07-06 Unicycler: Resolving bacterial genome assemblies from short and long sequencing reads Wick, Ryan R. Judd, Louise M. Gorrie, Claire L. Holt, Kathryn E. PLoS Comput Biol Research Article
Public Library of Science 2017-06-08 /pmc/articles/PMC5481147/ /pubmed/28594827 http://dx.doi.org/10.1371/journal.pcbi.1005595 Text en © 2017 Wick et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title Unicycler: Resolving bacterial genome assemblies from short and long sequencing reads
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5481147/
https://www.ncbi.nlm.nih.gov/pubmed/28594827
http://dx.doi.org/10.1371/journal.pcbi.1005595