
SSPACE-LongRead: scaffolding bacterial draft genomes using long read sequence information

BACKGROUND: The recent introduction of the Pacific Biosciences RS single molecule sequencing technology has opened new doors to scaffolding genome assemblies in a cost-effective manner. The long read sequence information is promised to enhance the quality of incomplete and inaccurate draft assemblie...

Bibliographic Details
Main Authors: Boetzer, Marten, Pirovano, Walter
Format: Online Article Text
Language: English
Published: BioMed Central 2014
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4076250/
https://www.ncbi.nlm.nih.gov/pubmed/24950923
http://dx.doi.org/10.1186/1471-2105-15-211
_version_ 1782323462218448896
author Boetzer, Marten
Pirovano, Walter
author_facet Boetzer, Marten
Pirovano, Walter
author_sort Boetzer, Marten
collection PubMed
description BACKGROUND: The recent introduction of the Pacific Biosciences RS single molecule sequencing technology has opened new doors to scaffolding genome assemblies in a cost-effective manner. The long read sequence information is promised to enhance the quality of incomplete and inaccurate draft assemblies constructed from Next Generation Sequencing (NGS) data. RESULTS: Here we propose a novel hybrid assembly methodology that aims to scaffold pre-assembled contigs in an iterative manner using PacBio RS long read information as a backbone. On a test set comprising six bacterial draft genomes, assembled using either a single Illumina MiSeq or Roche 454 library, we show that even a 50× coverage of uncorrected PacBio RS long reads is sufficient to drastically reduce the number of contigs. Comparisons to the AHA scaffolder indicate our strategy is better capable of producing (nearly) complete bacterial genomes. CONCLUSIONS: The current work describes our SSPACE-LongRead software which is designed to upgrade incomplete draft genomes using single molecule sequences. We conclude that the recent advances of the PacBio sequencing technology and chemistry, in combination with the limited computational resources required to run our program, allow to scaffold genomes in a fast and reliable manner.
format Online
Article
Text
id pubmed-4076250
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4076250 2014-07-01 SSPACE-LongRead: scaffolding bacterial draft genomes using long read sequence information Boetzer, Marten Pirovano, Walter BMC Bioinformatics Methodology Article BioMed Central 2014-06-20 /pmc/articles/PMC4076250/ /pubmed/24950923 http://dx.doi.org/10.1186/1471-2105-15-211 Text en Copyright © 2014 Boetzer and Pirovano; licensee BioMed Central Ltd.
http://creativecommons.org/licenses/by/4.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title SSPACE-LongRead: scaffolding bacterial draft genomes using long read sequence information
topic Methodology Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4076250/
https://www.ncbi.nlm.nih.gov/pubmed/24950923
http://dx.doi.org/10.1186/1471-2105-15-211