
Modified HuffBit Compress Algorithm – An Application of R

The databases of genomic sequences are growing at an exponential rate because of the increasing growth of living organisms. Compressing deoxyribonucleic acid (DNA) sequences is a momentous task as the databases are getting close to their threshold. Various compression algorithms have been developed for DN...


Bibliographic Details
Main Authors: Habib, Nahida; Ahmed, Kawsar; Jabin, Iffat; Rahman, Mohammad Motiur
Format: Online Article Text
Language: English
Published: De Gruyter 2018
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6340127/
https://www.ncbi.nlm.nih.gov/pubmed/29470175
http://dx.doi.org/10.1515/jib-2017-0057
author Habib, Nahida
Ahmed, Kawsar
Jabin, Iffat
Rahman, Mohammad Motiur
collection PubMed
description The databases of genomic sequences are growing at an exponential rate because of the increasing growth of living organisms. Compressing deoxyribonucleic acid (DNA) sequences is a momentous task as the databases are getting close to their threshold. Various compression algorithms have been developed for DNA sequence compression. “HuffBit Compress”, an efficient DNA compression algorithm that works on both repetitive and non-repetitive sequences, is based on the concept of the Extended Binary Tree. This paper proposes and develops a modified version of the “HuffBit Compress” algorithm, implemented in the R language, to compress and decompress DNA sequences. The modified version always gives the best case of the compression ratio, but uses 6 extra bits compared with the best case of the “HuffBit Compress” algorithm; it can be named the “Modified HuffBit Compress Algorithm”. The algorithm builds an extended binary tree based on the Huffman codes and the maximum occurring bases (A, C, G, T). In experiments with 6 sequences, the proposed algorithm gives approximately 16.18 % improvement in compression ratio over the “HuffBit Compress” algorithm and 11.12 % improvement in compression ratio over the “2-Bits Encoding Method”.
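The idea in the description can be illustrated with a minimal sketch. It is written in Python rather than the R used in the paper, and it rests on assumptions the abstract does not confirm: that the extended binary tree yields the code shape 0, 10, 110, 111; that the most frequent base always receives the shortest code; and that the 6 extra bits form a header recording the three most frequent bases at 2 bits each (the fourth is implied). The names `CODES`, `rank_bases`, `compress` and `decompress` are hypothetical.

```python
from collections import Counter

# Assumed code shape (not confirmed by the abstract): a skewed extended
# binary tree whose leaves sit at depths 1, 2, 3 and 3.
CODES = ["0", "10", "110", "111"]
BASES = "ACGT"


def rank_bases(seq):
    """Order A, C, G, T from most to least frequent (ties broken alphabetically)."""
    counts = Counter(seq)
    return sorted(BASES, key=lambda b: (-counts[b], b))


def compress(seq):
    """Return a bit string: hypothetical 6-bit header plus frequency-ranked codes.

    The input must contain only the characters A, C, G and T.
    """
    order = rank_bases(seq)
    # Hypothetical header: the 2-bit index of each of the three most frequent
    # bases; the fourth base is implied, so the header is 3 * 2 = 6 bits.
    header = "".join(format(BASES.index(b), "02b") for b in order[:3])
    table = dict(zip(order, CODES))
    return header + "".join(table[b] for b in seq)


def decompress(bits):
    """Invert compress(): read the 6-bit header, then decode the prefix codes."""
    top3 = [BASES[int(bits[i:i + 2], 2)] for i in (0, 2, 4)]
    last = next(b for b in BASES if b not in top3)
    table = dict(zip(CODES, top3 + [last]))
    out, current = [], ""
    for bit in bits[6:]:
        current += bit
        if current in table:  # prefix-free codes: first match is the symbol
            out.append(table[current])
            current = ""
    return "".join(out)
```

Under these assumptions, `len(compress(seq)) / len(seq)` gives the bits per base, which can be compared with the fixed 2 bits per base of a 2-bit encoding. On very short inputs the 6-bit header outweighs any saving; the gain appears only on longer sequences with a skewed base distribution.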
format Online
Article
Text
id pubmed-6340127
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher De Gruyter
record_format MEDLINE/PubMed
spelling pubmed-6340127 2019-01-28 Modified HuffBit Compress Algorithm – An Application of R Habib, Nahida Ahmed, Kawsar Jabin, Iffat Rahman, Mohammad Motiur J Integr Bioinform Research Articles The databases of genomic sequences are growing at an exponential rate because of the increasing growth of living organisms. Compressing deoxyribonucleic acid (DNA) sequences is a momentous task as the databases are getting close to their threshold. Various compression algorithms have been developed for DNA sequence compression. “HuffBit Compress”, an efficient DNA compression algorithm that works on both repetitive and non-repetitive sequences, is based on the concept of the Extended Binary Tree. This paper proposes and develops a modified version of the “HuffBit Compress” algorithm, implemented in the R language, to compress and decompress DNA sequences. The modified version always gives the best case of the compression ratio, but uses 6 extra bits compared with the best case of the “HuffBit Compress” algorithm; it can be named the “Modified HuffBit Compress Algorithm”. The algorithm builds an extended binary tree based on the Huffman codes and the maximum occurring bases (A, C, G, T). In experiments with 6 sequences, the proposed algorithm gives approximately 16.18 % improvement in compression ratio over the “HuffBit Compress” algorithm and 11.12 % improvement in compression ratio over the “2-Bits Encoding Method”. De Gruyter 2018-02-22 /pmc/articles/PMC6340127/ /pubmed/29470175 http://dx.doi.org/10.1515/jib-2017-0057 Text en ©2018 Nahida Habib et al., published by De Gruyter, Berlin/Boston http://creativecommons.org/licenses/by-nc-nd/3.0 This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 3.0 License.
title Modified HuffBit Compress Algorithm – An Application of R
topic Research Articles
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6340127/
https://www.ncbi.nlm.nih.gov/pubmed/29470175
http://dx.doi.org/10.1515/jib-2017-0057