
Hidden Addressing Encoding for DNA Storage

DNA is a natural storage medium with the advantages of high storage density and long service life compared with traditional media. DNA storage can meet the current storage requirements for massive data. Owing to the limitations of the DNA storage technology, the data need to be converted into short...


Bibliographic Details
Main authors: Wang, Penghao, Mu, Ziniu, Sun, Lijun, Si, Shuqing, Wang, Bin
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9344065/
https://www.ncbi.nlm.nih.gov/pubmed/35928958
http://dx.doi.org/10.3389/fbioe.2022.916615
author Wang, Penghao
Mu, Ziniu
Sun, Lijun
Si, Shuqing
Wang, Bin
collection PubMed
description DNA is a natural storage medium with the advantages of high storage density and long service life compared with traditional media. DNA storage can meet the current storage requirements for massive data. Owing to the limitations of the DNA storage technology, the data need to be converted into short DNA sequences for storage. However, in the process, a large amount of physical redundancy will be generated to index short DNA sequences. To reduce redundancy, this study proposes a DNA storage encoding scheme with hidden addressing. Using the improved fountain encoding scheme, the index replaces part of the data to realize hidden addresses, and then, a 10.1 MB file is encoded with the hidden addressing. First, the Dottup dot plot generator and the Jaccard similarity coefficient analyze the overall self-similarity of the encoding sequence index, and then the sequence fragments of GC content are used to verify the performance of this scheme. The final results show that the encoding scheme indexes with overall lower self-similarity, and the local thermodynamic properties of the sequence are better. The hidden addressing encoding scheme proposed can not only improve the utilization of bases but also ensure the correct rate of DNA storage during the sequencing and decoding processes.
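The two sequence checks named in the description, the Jaccard similarity coefficient and the GC content of sequence fragments, can be illustrated with a minimal sketch. This record does not give the article's exact procedure, so the k-mer size, the fragment length, and every function name below are assumptions chosen only for illustration.

```python
from __future__ import annotations


def kmers(seq: str, k: int = 4) -> set[str]:
    """Return the set of overlapping k-mers in seq (k is an assumed parameter)."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard(a: str, b: str, k: int = 4) -> float:
    """Jaccard similarity |A & B| / |A | B| between the k-mer sets of two sequences."""
    set_a, set_b = kmers(a, k), kmers(b, k)
    union = set_a | set_b
    # Two sequences shorter than k have no k-mers; treat them as dissimilar.
    return len(set_a & set_b) / len(union) if union else 0.0


def gc_content(seq: str) -> float:
    """Fraction of bases in seq that are G or C."""
    return sum(base in "GC" for base in seq) / len(seq) if seq else 0.0


def windowed_gc(seq: str, window: int = 10) -> list[float]:
    """GC content of consecutive, non-overlapping fragments of length `window`."""
    return [
        gc_content(seq[i:i + window])
        for i in range(0, len(seq) - window + 1, window)
    ]
```

A lower Jaccard value between two index sequences means they share fewer k-mers, and fragment GC values close to 0.5 indicate a balanced local base composition.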
format Online
Article
Text
id pubmed-9344065
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-9344065 2022-08-03 Hidden Addressing Encoding for DNA Storage Wang, Penghao Mu, Ziniu Sun, Lijun Si, Shuqing Wang, Bin Front Bioeng Biotechnol Bioengineering and Biotechnology Frontiers Media S.A. 2022-07-19 /pmc/articles/PMC9344065/ /pubmed/35928958 http://dx.doi.org/10.3389/fbioe.2022.916615 Text en
Copyright © 2022 Wang, Mu, Sun, Si and Wang. https://creativecommons.org/licenses/by/4.0/
This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title Hidden Addressing Encoding for DNA Storage
topic Bioengineering and Biotechnology
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9344065/
https://www.ncbi.nlm.nih.gov/pubmed/35928958
http://dx.doi.org/10.3389/fbioe.2022.916615