An Optimal Seed Based Compression Algorithm for DNA Sequences
This paper proposes a seed based lossless compression algorithm to compress a DNA sequence which uses a substitution method that is similar to the Lempel-Ziv compression scheme. The proposed method exploits the repetition structures that are inherent in DNA sequences by creating an offline dictionary...
Main Authors: | Eric, Pamela Vinitha; Gopalakrishnan, Gopakumar; Karunakaran, Muralikrishnan |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Hindawi Publishing Corporation 2016 |
Subjects: | Research Article |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4983397/ https://www.ncbi.nlm.nih.gov/pubmed/27555868 http://dx.doi.org/10.1155/2016/3528406 |
_version_ | 1782447904434159616 |
---|---|
author | Eric, Pamela Vinitha; Gopalakrishnan, Gopakumar; Karunakaran, Muralikrishnan |
author_facet | Eric, Pamela Vinitha; Gopalakrishnan, Gopakumar; Karunakaran, Muralikrishnan |
author_sort | Eric, Pamela Vinitha |
collection | PubMed |
description | This paper proposes a seed based lossless compression algorithm for DNA sequences that uses a substitution method similar to the Lempel-Ziv compression scheme. The proposed method exploits the repetition structures that are inherent in DNA sequences by creating an offline dictionary which contains all such repeats along with the details of mismatches. By ensuring that only promising mismatches are allowed, the method achieves a compression ratio that is on par with or better than the existing lossless DNA sequence compression algorithms. |
format | Online Article Text |
id | pubmed-4983397 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2016 |
publisher | Hindawi Publishing Corporation |
record_format | MEDLINE/PubMed |
spelling | pubmed-49833972016-08-23 An Optimal Seed Based Compression Algorithm for DNA Sequences Eric, Pamela Vinitha Gopalakrishnan, Gopakumar Karunakaran, Muralikrishnan Adv Bioinformatics Research Article This paper proposes a seed based lossless compression algorithm for DNA sequences that uses a substitution method similar to the Lempel-Ziv compression scheme. The proposed method exploits the repetition structures that are inherent in DNA sequences by creating an offline dictionary which contains all such repeats along with the details of mismatches. By ensuring that only promising mismatches are allowed, the method achieves a compression ratio that is on par with or better than the existing lossless DNA sequence compression algorithms. Hindawi Publishing Corporation 2016 2016-07-31 /pmc/articles/PMC4983397/ /pubmed/27555868 http://dx.doi.org/10.1155/2016/3528406 Text en Copyright © 2016 Pamela Vinitha Eric et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Research Article Eric, Pamela Vinitha Gopalakrishnan, Gopakumar Karunakaran, Muralikrishnan An Optimal Seed Based Compression Algorithm for DNA Sequences |
title | An Optimal Seed Based Compression Algorithm for DNA Sequences |
title_full | An Optimal Seed Based Compression Algorithm for DNA Sequences |
title_fullStr | An Optimal Seed Based Compression Algorithm for DNA Sequences |
title_full_unstemmed | An Optimal Seed Based Compression Algorithm for DNA Sequences |
title_short | An Optimal Seed Based Compression Algorithm for DNA Sequences |
title_sort | optimal seed based compression algorithm for dna sequences |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4983397/ https://www.ncbi.nlm.nih.gov/pubmed/27555868 http://dx.doi.org/10.1155/2016/3528406 |
work_keys_str_mv | AT ericpamelavinitha anoptimalseedbasedcompressionalgorithmfordnasequences AT gopalakrishnangopakumar anoptimalseedbasedcompressionalgorithmfordnasequences AT karunakaranmuralikrishnan anoptimalseedbasedcompressionalgorithmfordnasequences AT ericpamelavinitha optimalseedbasedcompressionalgorithmfordnasequences AT gopalakrishnangopakumar optimalseedbasedcompressionalgorithmfordnasequences AT karunakaranmuralikrishnan optimalseedbasedcompressionalgorithmfordnasequences |