The Genomedata format for storing large-scale functional genomics data
Summary: We present a format for efficient storage of multiple tracks of numeric data anchored to a genome. The format allows fast random access to hundreds of gigabytes of data, while retaining a small disk space footprint. We have also developed utilities to load data into this format. We show tha...
Main Authors: | Hoffman, Michael M.; Buske, Orion J.; Noble, William Stafford |
---|---|
Format: | Text |
Language: | English |
Published: | Oxford University Press 2010 |
Subjects: | Applications Note |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2872006/ https://www.ncbi.nlm.nih.gov/pubmed/20435580 http://dx.doi.org/10.1093/bioinformatics/btq164 |
_version_ | 1782181193661284352 |
---|---|
author | Hoffman, Michael M.; Buske, Orion J.; Noble, William Stafford |
collection | PubMed |
description | Summary: We present a format for efficient storage of multiple tracks of numeric data anchored to a genome. The format allows fast random access to hundreds of gigabytes of data, while retaining a small disk space footprint. We have also developed utilities to load data into this format. We show that retrieving data from this format is more than 2900 times faster than a naive approach using wiggle files. Availability and Implementation: Reference implementation in Python and C components available at http://noble.gs.washington.edu/proj/genomedata/ under the GNU General Public License. Contact: william-noble@uw.edu |
format | Text |
id | pubmed-2872006 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2010 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-2872006; 2010-05-24; The Genomedata format for storing large-scale functional genomics data; Hoffman, Michael M.; Buske, Orion J.; Noble, William Stafford; Bioinformatics; Applications Note; Summary: We present a format for efficient storage of multiple tracks of numeric data anchored to a genome. The format allows fast random access to hundreds of gigabytes of data, while retaining a small disk space footprint. We have also developed utilities to load data into this format. We show that retrieving data from this format is more than 2900 times faster than a naive approach using wiggle files. Availability and Implementation: Reference implementation in Python and C components available at http://noble.gs.washington.edu/proj/genomedata/ under the GNU General Public License. Contact: william-noble@uw.edu; Oxford University Press; 2010-06-01; 2010-04-29; /pmc/articles/PMC2872006/ /pubmed/20435580 http://dx.doi.org/10.1093/bioinformatics/btq164; Text; en; © The Author 2010. Published by Oxford University Press. http://creativecommons.org/licenses/by-nc/2.0/uk/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.5), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited. |
title | The Genomedata format for storing large-scale functional genomics data |
topic | Applications Note |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2872006/ https://www.ncbi.nlm.nih.gov/pubmed/20435580 http://dx.doi.org/10.1093/bioinformatics/btq164 |