Assembling millions of short DNA sequences using SSAKE
Summary: Novel DNA sequencing technologies with the potential for up to three orders of magnitude more sequence throughput than conventional Sanger sequencing are emerging. The instrument now available from Solexa Ltd. produces millions of short DNA sequences of 25 nt each. Due to ubiquitous repeats in...
Main authors: | Warren, René L.; Sutton, Granger G.; Jones, Steven J. M.; Holt, Robert A. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press, 2007 |
Subjects: | Applications Notes |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7109930/ https://www.ncbi.nlm.nih.gov/pubmed/17158514 http://dx.doi.org/10.1093/bioinformatics/btl629 |
_version_ | 1783513004586827776 |
---|---|
author | Warren, René L. Sutton, Granger G. Jones, Steven J. M. Holt, Robert A. |
author_facet | Warren, René L. Sutton, Granger G. Jones, Steven J. M. Holt, Robert A. |
author_sort | Warren, René L. |
collection | PubMed |
description | Summary: Novel DNA sequencing technologies with the potential for up to three orders of magnitude more sequence throughput than conventional Sanger sequencing are emerging. The instrument now available from Solexa Ltd. produces millions of short DNA sequences of 25 nt each. Because repeats are ubiquitous in large genomes and short sequences cannot characterize them uniquely and unambiguously, the short read length limits applicability for de novo sequencing. However, given the sequencing depth and the throughput of this instrument, stringent assembly of highly identical sequences can be achieved. We describe SSAKE, a tool for aggressively assembling millions of short nucleotide sequences by progressively searching through a prefix tree for the longest possible overlap between any two sequences. SSAKE is designed to help leverage the information from short sequence reads by stringently assembling them into contiguous sequences that can be used to characterize novel sequencing targets. Availability: Contact: rwarren@bcgsc.ca |
format | Online Article Text |
id | pubmed-7109930 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2007 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-71099302020-04-02 Assembling millions of short DNA sequences using SSAKE Warren, René L. Sutton, Granger G. Jones, Steven J. M. Holt, Robert A. Bioinformatics Applications Notes Summary: Novel DNA sequencing technologies with the potential for up to three orders of magnitude more sequence throughput than conventional Sanger sequencing are emerging. The instrument now available from Solexa Ltd. produces millions of short DNA sequences of 25 nt each. Due to ubiquitous repeats in large genomes and the inability of short sequences to uniquely and unambiguously characterize them, the short read length limits applicability for de novo sequencing. However, given the sequencing depth and the throughput of this instrument, stringent assembly of highly identical sequences can be achieved. We describe SSAKE, a tool for aggressively assembling millions of short nucleotide sequences by progressively searching through a prefix tree for the longest possible overlap between any two sequences. SSAKE is designed to help leverage the information from short sequence reads by stringently assembling them into contiguous sequences that can be used to characterize novel sequencing targets. Availability: Contact: rwarren@bcgsc.ca Oxford University Press 2007-02-15 2006-12-08 /pmc/articles/PMC7109930/ /pubmed/17158514 http://dx.doi.org/10.1093/bioinformatics/btl629 Text en © 2006 The Author(s) This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.0/uk/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited. This article is made available via the PMC Open Access Subset for unrestricted re-use and analyses in any form or by any means with acknowledgement of the original source.
These permissions are granted for the duration of the COVID-19 pandemic or until permissions are revoked in writing. Upon expiration of these permissions, PMC is granted a perpetual license to make this article available via PMC and Europe PMC, consistent with existing copyright protections. |
spellingShingle | Applications Notes Warren, René L. Sutton, Granger G. Jones, Steven J. M. Holt, Robert A. Assembling millions of short DNA sequences using SSAKE |
title | Assembling millions of short DNA sequences using SSAKE |
title_full | Assembling millions of short DNA sequences using SSAKE |
title_fullStr | Assembling millions of short DNA sequences using SSAKE |
title_full_unstemmed | Assembling millions of short DNA sequences using SSAKE |
title_short | Assembling millions of short DNA sequences using SSAKE |
title_sort | assembling millions of short dna sequences using ssake |
topic | Applications Notes |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7109930/ https://www.ncbi.nlm.nih.gov/pubmed/17158514 http://dx.doi.org/10.1093/bioinformatics/btl629 |
work_keys_str_mv | AT warrenrenel assemblingmillionsofshortdnasequencesusingssake AT suttongrangerg assemblingmillionsofshortdnasequencesusingssake AT jonesstevenjm assemblingmillionsofshortdnasequencesusingssake AT holtroberta assemblingmillionsofshortdnasequencesusingssake |
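
The description field in this record says SSAKE assembles reads by progressively searching a prefix tree for the longest possible overlap between sequences. The following is a minimal Python sketch of that general idea, not SSAKE's actual implementation: the function names (`build_prefix_tree`, `reads_with_prefix`, `extend_contig`), the `min_overlap` default, and the rule of stopping when more than one read matches are all assumptions made for illustration. Reverse-complement handling and 5' extension are omitted.

```python
def build_prefix_tree(reads):
    """Nested-dict trie: each path from the root spells a read prefix."""
    root = {}
    for read in reads:
        node = root
        for base in read:
            node = node.setdefault(base, {})
        node.setdefault("$", set()).add(read)  # "$" marks reads that end here
    return root


def reads_with_prefix(tree, prefix):
    """Return every read stored in the trie that starts with `prefix`."""
    node = tree
    for base in prefix:
        node = node.get(base)
        if node is None:
            return set()
    found, stack = set(), [node]
    while stack:
        current = stack.pop()
        for key, child in current.items():
            if key == "$":
                found |= child
            else:
                stack.append(child)
    return found


def extend_contig(seed, reads, min_overlap=4):
    """Greedily extend `seed` in the 3' direction (hypothetical helper).

    For each round, the longest contig suffix is tried first, so the first
    hit is the longest overlap. Assumption: extension stops when a suffix
    matches more than one unused read (a stand-in for "stringent" assembly).
    """
    unused = set(reads) - {seed}
    tree = build_prefix_tree(unused)
    max_read_len = max((len(r) for r in unused), default=0)
    contig = seed
    extended = True
    while extended:
        extended = False
        longest = min(len(contig), max_read_len - 1)
        for k in range(longest, min_overlap - 1, -1):
            hits = reads_with_prefix(tree, contig[-k:]) & unused
            hits = {r for r in hits if len(r) > k}  # must add new bases
            if len(hits) == 1:
                read = hits.pop()
                contig += read[k:]      # append the non-overlapping tail
                unused.discard(read)    # each read is used at most once
                extended = True
                break
            if hits:                    # ambiguous overlap: stop extending
                break
    return contig


reads = ["ACGTAC", "GTACGG", "ACGGTT"]
contig = extend_contig("ACGTAC", reads, min_overlap=4)
print(contig)
```

With the three toy reads above, each round finds exactly one read whose prefix matches a 4-base suffix of the growing contig, so the seed is extended twice. Replacing the last read with one that shares the `GTAC` prefix would make the first overlap ambiguous and leave the seed unextended.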