High capacity DNA data storage with variable-length Oligonucleotides using repeat accumulate code and hybrid mapping

BACKGROUND: With the inherent high density and durable preservation, DNA has been recently recognized as a distinguished medium to store enormous data over millennia. To overcome the limitations existing in a recently reported high-capacity DNA data storage while achieving a competitive information...

Bibliographic Details
Main authors: Wang, Yixin, Noor-A-Rahim, Md, Zhang, Jingyun, Gunawan, Erry, Guan, Yong Liang, Poh, Chueh Loo
Format: Online Article Text
Language: English
Published: BioMed Central 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6868767/
https://www.ncbi.nlm.nih.gov/pubmed/31832092
http://dx.doi.org/10.1186/s13036-019-0211-2
_version_ 1783472338842419200
author Wang, Yixin
Noor-A-Rahim, Md
Zhang, Jingyun
Gunawan, Erry
Guan, Yong Liang
Poh, Chueh Loo
author_facet Wang, Yixin
Noor-A-Rahim, Md
Zhang, Jingyun
Gunawan, Erry
Guan, Yong Liang
Poh, Chueh Loo
author_sort Wang, Yixin
collection PubMed
description BACKGROUND: With its inherent high density and durable preservation, DNA has recently been recognized as a distinguished medium for storing enormous amounts of data over millennia. To overcome the limitations of a recently reported high-capacity DNA data storage scheme while achieving a competitive information capacity, we are inspired to explore a new coding system that facilitates the practical implementation of DNA data storage with high capacity. RESULT: In this work, we devised and implemented a DNA data storage scheme with variable-length oligonucleotides (oligos), in which a hybrid DNA mapping scheme that converts digital data to DNA records is introduced. The encoded DNA oligos store 1.98 bits per nucleotide (bits/nt) on average (approaching the upper bound of 2 bits/nt), while conforming to the biochemical constraints. Beyond that, an oligo-level repeat-accumulate coding scheme is employed to address data loss and corruption in the biochemical processes. In a wet-lab experiment, error-free retrieval of 379.1 KB of data with a minimum coverage of 10x is achieved, validating the error resilience of the proposed coding scheme. In addition, the theoretical analysis shows that the proposed scheme exhibits a net information density (user bits per nucleotide) of 1.67 bits/nt while achieving 91% of the information capacity. CONCLUSION: To advance towards practical implementations of DNA storage, we proposed and tested a DNA data storage system that enables a high-potential mapping (bits to nucleotide conversion) scheme and a low-redundancy but highly efficient error correction code design. The advancement reported would move us closer to achieving a practical high-capacity DNA data storage system.
format Online
Article
Text
id pubmed-6868767
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-68687672019-12-12 High capacity DNA data storage with variable-length Oligonucleotides using repeat accumulate code and hybrid mapping Wang, Yixin Noor-A-Rahim, Md Zhang, Jingyun Gunawan, Erry Guan, Yong Liang Poh, Chueh Loo J Biol Eng Research
BioMed Central 2019-11-21 /pmc/articles/PMC6868767/ /pubmed/31832092 http://dx.doi.org/10.1186/s13036-019-0211-2 Text en © The Author(s) 2019 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle Research
Wang, Yixin
Noor-A-Rahim, Md
Zhang, Jingyun
Gunawan, Erry
Guan, Yong Liang
Poh, Chueh Loo
High capacity DNA data storage with variable-length Oligonucleotides using repeat accumulate code and hybrid mapping
title High capacity DNA data storage with variable-length Oligonucleotides using repeat accumulate code and hybrid mapping
title_full High capacity DNA data storage with variable-length Oligonucleotides using repeat accumulate code and hybrid mapping
title_fullStr High capacity DNA data storage with variable-length Oligonucleotides using repeat accumulate code and hybrid mapping
title_full_unstemmed High capacity DNA data storage with variable-length Oligonucleotides using repeat accumulate code and hybrid mapping
title_short High capacity DNA data storage with variable-length Oligonucleotides using repeat accumulate code and hybrid mapping
title_sort high capacity dna data storage with variable-length oligonucleotides using repeat accumulate code and hybrid mapping
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6868767/
https://www.ncbi.nlm.nih.gov/pubmed/31832092
http://dx.doi.org/10.1186/s13036-019-0211-2