
Gap5—editing the billion fragment sequence assembly

Motivation: Existing sequence assembly editors struggle with the volumes of data now readily available from the latest generation of DNA sequencing instruments. Results: We describe the Gap5 software along with the data structures and algorithms used that allow it to be scalable. We demonstrate this...


Bibliographic Details
Main authors: Bonfield, James K., Whitwham, Andrew
Format: Text
Language: English
Published: Oxford University Press 2010
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2894512/
https://www.ncbi.nlm.nih.gov/pubmed/20513662
http://dx.doi.org/10.1093/bioinformatics/btq268
_version_ 1782183196680519680
author Bonfield, James K.
Whitwham, Andrew
author_facet Bonfield, James K.
Whitwham, Andrew
author_sort Bonfield, James K.
collection PubMed
description Motivation: Existing sequence assembly editors struggle with the volumes of data now readily available from the latest generation of DNA sequencing instruments. Results: We describe the Gap5 software along with the data structures and algorithms used that allow it to be scalable. We demonstrate this with an assembly of 1.1 billion sequence fragments and compare the performance with several other programs. We analyse the memory, CPU, I/O usage and file sizes used by Gap5. Availability and Implementation: Gap5 is part of the Staden Package and is available under an Open Source licence from http://staden.sourceforge.net. It is implemented in C and Tcl/Tk. Currently it works on Unix systems only. Contact: jkb@sanger.ac.uk Supplementary information: Supplementary data are available at Bioinformatics online.
format Text
id pubmed-2894512
institution National Center for Biotechnology Information
language English
publishDate 2010
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-28945122010-07-01 Gap5—editing the billion fragment sequence assembly Bonfield, James K. Whitwham, Andrew Bioinformatics Original Papers Motivation: Existing sequence assembly editors struggle with the volumes of data now readily available from the latest generation of DNA sequencing instruments. Results: We describe the Gap5 software along with the data structures and algorithms used that allow it to be scalable. We demonstrate this with an assembly of 1.1 billion sequence fragments and compare the performance with several other programs. We analyse the memory, CPU, I/O usage and file sizes used by Gap5. Availability and Implementation: Gap5 is part of the Staden Package and is available under an Open Source licence from http://staden.sourceforge.net. It is implemented in C and Tcl/Tk. Currently it works on Unix systems only. Contact: jkb@sanger.ac.uk Supplementary information: Supplementary data are available at Bioinformatics online. Oxford University Press 2010-07-15 2010-05-30 /pmc/articles/PMC2894512/ /pubmed/20513662 http://dx.doi.org/10.1093/bioinformatics/btq268 Text en © The Author(s) 2010. Published by Oxford University Press. http://creativecommons.org/licenses/by-nc/2.0/uk/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.5), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Original Papers
Bonfield, James K.
Whitwham, Andrew
Gap5—editing the billion fragment sequence assembly
title Gap5—editing the billion fragment sequence assembly
title_full Gap5—editing the billion fragment sequence assembly
title_fullStr Gap5—editing the billion fragment sequence assembly
title_full_unstemmed Gap5—editing the billion fragment sequence assembly
title_short Gap5—editing the billion fragment sequence assembly
title_sort gap5—editing the billion fragment sequence assembly
topic Original Papers
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2894512/
https://www.ncbi.nlm.nih.gov/pubmed/20513662
http://dx.doi.org/10.1093/bioinformatics/btq268
work_keys_str_mv AT bonfieldjamesk gap5editingthebillionfragmentsequenceassembly
AT whitwhamandrew gap5editingthebillionfragmentsequenceassembly