
Pebble and Rock Band: Heuristic Resolution of Repeats and Scaffolding in the Velvet Short-Read de Novo Assembler

BACKGROUND: Despite the short length of their reads, micro-read sequencing technologies have shown their usefulness for de novo sequencing. However, especially in eukaryotic genomes, complex repeat patterns are an obstacle to large assemblies. PRINCIPAL FINDINGS: We present a novel heuristic algorit...

Bibliographic Details
Main authors: Zerbino, Daniel R., McEwen, Gayle K., Margulies, Elliott H., Birney, Ewan
Format: Text
Language: English
Published: Public Library of Science 2009
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2793427/
https://www.ncbi.nlm.nih.gov/pubmed/20027311
http://dx.doi.org/10.1371/journal.pone.0008407
author Zerbino, Daniel R.
McEwen, Gayle K.
Margulies, Elliott H.
Birney, Ewan
collection PubMed
description BACKGROUND: Despite the short length of their reads, micro-read sequencing technologies have shown their usefulness for de novo sequencing. However, especially in eukaryotic genomes, complex repeat patterns are an obstacle to large assemblies. PRINCIPAL FINDINGS: We present a novel heuristic algorithm, Pebble, which uses paired-end read information to resolve repeats and scaffold contigs to produce large-scale assemblies. In simulations, we can achieve weighted median scaffold lengths (N50) of above 1 Mbp in Bacteria and above 100 kbp in more complex organisms. Using real datasets we obtained a 96 kbp N50 in Pseudomonas syringae and a unique 147 kbp scaffold of a ferret BAC clone. We also present an efficient algorithm called Rock Band for the resolution of repeats in the case of mixed length assemblies, where different sequencing platforms are combined to obtain a cost-effective assembly. CONCLUSIONS: These algorithms extend the utility of short read only assemblies into large complex genomes. They have been implemented and made available within the open-source Velvet short-read de novo assembler.
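The description above reports results as weighted median scaffold lengths (N50). As a minimal sketch, assuming the common definition (the length of the scaffold at which the running sum of lengths, taken from longest to shortest, first reaches half of the total assembled length), N50 can be computed as follows. The function name `n50` and the example lengths are hypothetical and are not taken from the article or from Velvet; the record does not state exactly how the authors computed the statistic.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold (or contig) lengths.

    Assumed definition: sort lengths from longest to shortest and return
    the length at which the running total first reaches at least half of
    the summed length of all scaffolds.
    """
    if not lengths:
        raise ValueError("need at least one scaffold length")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Compare 2 * running with total to avoid fractional halves.
        if 2 * running >= total:
            return length


# Hypothetical scaffold lengths in bp, for illustration only.
example_lengths = [80, 70, 50, 40, 30, 20, 10]
print(n50(example_lengths))  # prints 70
```

In this example the total is 300 bp; the two longest scaffolds (80 and 70) together reach 150 bp, half of the total, so the N50 is 70. Unlike a plain median, the statistic weights each scaffold by its length, so many short scaffolds lower it only slightly.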
format Text
id pubmed-2793427
institution National Center for Biotechnology Information
language English
publishDate 2009
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-2793427 2009-12-22 Pebble and Rock Band: Heuristic Resolution of Repeats and Scaffolding in the Velvet Short-Read de Novo Assembler. Zerbino, Daniel R.; McEwen, Gayle K.; Margulies, Elliott H.; Birney, Ewan. PLoS One. Research Article. Public Library of Science 2009-12-22 /pmc/articles/PMC2793427/ /pubmed/20027311 http://dx.doi.org/10.1371/journal.pone.0008407 Text en This is an open-access article distributed under the terms of the Creative Commons Public Domain declaration, which stipulates that, once placed in the public domain, this work may be freely reproduced, distributed, transmitted, modified, built upon, or otherwise used by anyone for any lawful purpose. https://creativecommons.org/publicdomain/zero/1.0/
title Pebble and Rock Band: Heuristic Resolution of Repeats and Scaffolding in the Velvet Short-Read de Novo Assembler
topic Research Article