
DNABIT Compress – Genome compression algorithm

Data compression is concerned with how information is organized in data. Efficient storage means removal of redundancy from the data being stored in the DNA molecule. Data compression algorithms remove redundancy and are used to understand biologically important molecules. We present a compression a...


Bibliographic Details
Main Authors: Rajarajeswari, Pothuraju; Apparao, Allam
Format: Text
Language: English
Published: Biomedical Informatics 2011
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3046040/
https://www.ncbi.nlm.nih.gov/pubmed/21383923
_version_ 1782198909358047232
author Rajarajeswari, Pothuraju
Apparao, Allam
author_facet Rajarajeswari, Pothuraju
Apparao, Allam
author_sort Rajarajeswari, Pothuraju
collection PubMed
description Data compression is concerned with how information is organized in data. Efficient storage means removal of redundancy from the data being stored in the DNA molecule. Data compression algorithms remove redundancy and are used to understand biologically important molecules. We present a compression algorithm, “DNABIT Compress”, for DNA sequences, based on a novel scheme that assigns binary bits to smaller segments of DNA bases in order to compress both repetitive and non-repetitive DNA sequences. The proposed algorithm achieves the best compression ratio for DNA sequences of larger genomes. Significantly better compression results show that the “DNABIT Compress” algorithm outperforms the other compression algorithms compared. While achieving the best compression ratios for DNA sequences (genomes), the new DNABIT Compress algorithm also significantly improves on the running time of all previous DNA compression programs. Assigning binary bits (a unique bit code) to exact-repeat and reverse-repeat fragments of a DNA sequence is also a concept introduced in this algorithm for the first time in DNA compression. The proposed algorithm achieves a compression ratio as low as 1.58 bits/base, whereas the best existing methods could not achieve a ratio below 1.72 bits/base.
format Text
id pubmed-3046040
institution National Center for Biotechnology Information
language English
publishDate 2011
publisher Biomedical Informatics
record_format MEDLINE/PubMed
spelling pubmed-3046040 2011-03-07 DNABIT Compress – Genome compression algorithm Rajarajeswari, Pothuraju Apparao, Allam Bioinformation Software Biomedical Informatics 2011-01-22 /pmc/articles/PMC3046040/ /pubmed/21383923 Text en © 2011 Biomedical Informatics Publishing Group This is an open-access article, which permits unrestricted use, distribution, and reproduction in any medium, for non-commercial purposes, provided the original author and source are credited.
spellingShingle Software
Rajarajeswari, Pothuraju
Apparao, Allam
DNABIT Compress – Genome compression algorithm
title DNABIT Compress – Genome compression algorithm
title_full DNABIT Compress – Genome compression algorithm
title_fullStr DNABIT Compress – Genome compression algorithm
title_full_unstemmed DNABIT Compress – Genome compression algorithm
title_short DNABIT Compress – Genome compression algorithm
title_sort dnabit compress – genome compression algorithm
topic Software
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3046040/
https://www.ncbi.nlm.nih.gov/pubmed/21383923
work_keys_str_mv AT rajarajeswaripothuraju dnabitcompressgenomecompressionalgorithm
AT apparaoallam dnabitcompressgenomecompressionalgorithm
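
Note on the bits/base figures in the description: a DNA alphabet of four bases (A, C, G, T) can always be stored with a fixed 2 bits per base, so ratios such as 1.58 and 1.72 bits/base are usually read against that baseline. The sketch below shows only this 2-bit baseline and the resulting saving. It is not the DNABIT Compress algorithm, whose bit-code table is not given in this record; the mapping in `CODE` and the helper names `pack`, `unpack`, and `saving_vs_two_bit` are assumptions made for illustration.

```python
# Baseline sketch only: fixed 2-bit packing of A/C/G/T.
# This is NOT the DNABIT Compress scheme; the mapping below is an assumed one.
CODE = {"A": 0b00, "C": 0b01, "G": 0b10, "T": 0b11}
BASES = "ACGT"  # index i holds the base whose code is i


def pack(seq: str) -> bytes:
    """Pack a DNA string into bytes, four bases per byte.

    A short final chunk is left-aligned and zero-padded on the right.
    """
    out = bytearray()
    for i in range(0, len(seq), 4):
        chunk = seq[i:i + 4]
        byte = 0
        for base in chunk:
            byte = (byte << 2) | CODE[base]
        byte <<= 2 * (4 - len(chunk))  # pad an incomplete final chunk
        out.append(byte)
    return bytes(out)


def unpack(data: bytes, n_bases: int) -> str:
    """Recover the first n_bases bases from data produced by pack()."""
    bases = []
    for byte in data:
        for shift in (6, 4, 2, 0):
            bases.append(BASES[(byte >> shift) & 0b11])
    return "".join(bases[:n_bases])


def saving_vs_two_bit(bits_per_base: float) -> float:
    """Fraction of storage saved relative to the 2 bits/base baseline."""
    return 1.0 - bits_per_base / 2.0


if __name__ == "__main__":
    seq = "ACGTAC"
    packed = pack(seq)
    assert unpack(packed, len(seq)) == seq
    # Ratios quoted in the description, compared with the 2-bit baseline.
    for ratio in (1.58, 1.72):
        print(f"{ratio} bits/base saves {saving_vs_two_bit(ratio):.0%}")
```

Read this way, 1.58 bits/base is roughly a 21% reduction over plain 2-bit packing, and 1.72 bits/base roughly 14%.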