Scaffolding and completing genome assemblies in real-time with nanopore sequencing

Third generation sequencing technologies provide the opportunity to improve genome assemblies by generating long reads spanning most repeat sequences. However, current analysis methods require substantial amounts of sequence data and computational resources to overcome the high error rates. Furtherm...

Bibliographic Details
Main authors: Cao, Minh Duc, Nguyen, Son Hoang, Ganesamoorthy, Devika, Elliott, Alysha G., Cooper, Matthew A., Coin, Lachlan J. M.
Format: Online Article Text
Language: English
Published: Nature Publishing Group 2017
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5321748/
https://www.ncbi.nlm.nih.gov/pubmed/28218240
http://dx.doi.org/10.1038/ncomms14515
_version_ 1782509732150378496
author Cao, Minh Duc
Nguyen, Son Hoang
Ganesamoorthy, Devika
Elliott, Alysha G.
Cooper, Matthew A.
Coin, Lachlan J. M.
collection PubMed
description Third generation sequencing technologies provide the opportunity to improve genome assemblies by generating long reads spanning most repeat sequences. However, current analysis methods require substantial amounts of sequence data and computational resources to overcome the high error rates. Furthermore, they can only perform analysis after sequencing has completed, resulting in either over-sequencing, or in a low quality assembly due to under-sequencing. Here we present npScarf, which can scaffold and complete short read assemblies while the long read sequencing run is in progress. It reports assembly metrics in real-time so the sequencing run can be terminated once an assembly of sufficient quality is obtained. In assembling four bacterial and one eukaryotic genomes, we show that npScarf can construct more complete and accurate assemblies while requiring less sequencing data and computational resources than existing methods. Our approach offers a time- and resource-effective strategy for completing short read assemblies.
format Online
Article
Text
id pubmed-5321748
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher Nature Publishing Group
record_format MEDLINE/PubMed
spelling pubmed-53217482017-03-01 Scaffolding and completing genome assemblies in real-time with nanopore sequencing Cao, Minh Duc Nguyen, Son Hoang Ganesamoorthy, Devika Elliott, Alysha G. Cooper, Matthew A. Coin, Lachlan J. M. Nat Commun Article Third generation sequencing technologies provide the opportunity to improve genome assemblies by generating long reads spanning most repeat sequences. However, current analysis methods require substantial amounts of sequence data and computational resources to overcome the high error rates. Furthermore, they can only perform analysis after sequencing has completed, resulting in either over-sequencing, or in a low quality assembly due to under-sequencing. Here we present npScarf, which can scaffold and complete short read assemblies while the long read sequencing run is in progress. It reports assembly metrics in real-time so the sequencing run can be terminated once an assembly of sufficient quality is obtained. In assembling four bacterial and one eukaryotic genomes, we show that npScarf can construct more complete and accurate assemblies while requiring less sequencing data and computational resources than existing methods. Our approach offers a time- and resource-effective strategy for completing short read assemblies. Nature Publishing Group 2017-02-20 /pmc/articles/PMC5321748/ /pubmed/28218240 http://dx.doi.org/10.1038/ncomms14515 Text en Copyright © 2017, The Author(s) http://creativecommons.org/licenses/by/4.0/ This work is licensed under a Creative Commons Attribution 4.0 International License. The images or other third party material in this article are included in the article's Creative Commons license, unless indicated otherwise in the credit line; if the material is not included under the Creative Commons license, users will need to obtain permission from the license holder to reproduce the material. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/
title Scaffolding and completing genome assemblies in real-time with nanopore sequencing
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5321748/
https://www.ncbi.nlm.nih.gov/pubmed/28218240
http://dx.doi.org/10.1038/ncomms14515