BioCode: Two biologically compatible Algorithms for embedding data in non-coding and coding regions of DNA

Bibliographic Details
Main Authors: Haughton, David; Balado, Félix
Format: Online Article Text
Language: English
Published: BioMed Central 2013
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3698116/
https://www.ncbi.nlm.nih.gov/pubmed/23570444
http://dx.doi.org/10.1186/1471-2105-14-121
author Haughton, David
Balado, Félix
collection PubMed
description BACKGROUND: In recent times, the application of deoxyribonucleic acid (DNA) has diversified with the emergence of fields such as DNA computing and DNA data embedding. DNA data embedding, also known as DNA watermarking or DNA steganography, aims to develop robust algorithms for encoding non-genetic information in DNA. DNA is inherently a digital medium in which the nucleotide bases act as digital symbols, a fact which underpins all bioinformatics techniques and which also makes simple information encoding in DNA straightforward. However, the situation is more complex for methods that aim to embed information in the genomes of living organisms. DNA is susceptible to mutations, which act as a noisy channel from the point of view of the encoded information. This means that the DNA data embedding field is closely related to digital communications. Moreover, it is a distinctive digital communications area, because important biological constraints must be observed by all methods. Many DNA data embedding algorithms have been presented to date, all of which operate in one of two regions: non-coding DNA (ncDNA) or protein-coding DNA (pcDNA). RESULTS: This paper proposes two novel DNA data embedding algorithms, jointly called BioCode, which operate in ncDNA and pcDNA, respectively, and which comply fully with stricter biological restrictions. Existing methods comply with some elementary biological constraints, such as preserving protein translation in pcDNA. However, there exist further biological restrictions which no DNA data embedding method to date accounts for. Observing these constraints is key to increasing the biocompatibility and, in turn, the robustness of information encoded in DNA. CONCLUSION: The algorithms encode information in near-optimal ways from a coding point of view, as we demonstrate by means of theoretical and empirical (in silico) analyses. Also, they are shown to encode information in a robust way, such that mutations have isolated effects. Furthermore, the preservation of codon statistics, while achieving a near-optimum embedding rate, implies that BioCode pcDNA is also a near-optimum first-order steganographic method.
format Online
Article
Text
id pubmed-3698116
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3698116 2013-07-02 BioCode: Two biologically compatible Algorithms for embedding data in non-coding and coding regions of DNA Haughton, David Balado, Félix BMC Bioinformatics Research Article BioMed Central 2013-04-09 /pmc/articles/PMC3698116/ /pubmed/23570444 http://dx.doi.org/10.1186/1471-2105-14-121 Text en Copyright © 2013 Haughton and Balado; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title BioCode: Two biologically compatible Algorithms for embedding data in non-coding and coding regions of DNA
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3698116/
https://www.ncbi.nlm.nih.gov/pubmed/23570444
http://dx.doi.org/10.1186/1471-2105-14-121