Data compression for sequencing data

Post-Sanger sequencing methods produce tons of data, and there is a general agreement that the challenge to store and process them must be addressed with data compression. In this review we first answer the question “why compression” in a quantitative manner. Then we also answer the questions “what”...

Bibliographic Details
Main Authors: Deorowicz, Sebastian; Grabowski, Szymon
Format: Online Article Text
Language: English
Published: BioMed Central 2013
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3868316/
https://www.ncbi.nlm.nih.gov/pubmed/24252160
http://dx.doi.org/10.1186/1748-7188-8-25
_version_ 1782296442713407488
author Deorowicz, Sebastian
Grabowski, Szymon
author_facet Deorowicz, Sebastian
Grabowski, Szymon
author_sort Deorowicz, Sebastian
collection PubMed
description Post-Sanger sequencing methods produce tons of data, and there is a general agreement that the challenge to store and process them must be addressed with data compression. In this review we first answer the question “why compression” in a quantitative manner. Then we also answer the questions “what” and “how”, by sketching the fundamental compression ideas, describing the main sequencing data types and formats, and comparing the specialized compression algorithms and tools. Finally, we go back to the question “why compression” and give other, perhaps surprising answers, demonstrating the pervasiveness of data compression techniques in computational biology.
format Online
Article
Text
id pubmed-3868316
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3868316
2013-12-20
Data compression for sequencing data
Deorowicz, Sebastian
Grabowski, Szymon
Algorithms Mol Biol
Review Article
Post-Sanger sequencing methods produce tons of data, and there is a general agreement that the challenge to store and process them must be addressed with data compression. In this review we first answer the question “why compression” in a quantitative manner. Then we also answer the questions “what” and “how”, by sketching the fundamental compression ideas, describing the main sequencing data types and formats, and comparing the specialized compression algorithms and tools. Finally, we go back to the question “why compression” and give other, perhaps surprising answers, demonstrating the pervasiveness of data compression techniques in computational biology.
BioMed Central 2013-11-19
/pmc/articles/PMC3868316/
/pubmed/24252160
http://dx.doi.org/10.1186/1748-7188-8-25
Text
en
Copyright © 2013 Deorowicz and Grabowski; licensee BioMed Central Ltd.
http://creativecommons.org/licenses/by/2.0
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Review Article
Deorowicz, Sebastian
Grabowski, Szymon
Data compression for sequencing data
title Data compression for sequencing data
topic Review Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3868316/
https://www.ncbi.nlm.nih.gov/pubmed/24252160
http://dx.doi.org/10.1186/1748-7188-8-25