The Essential Component in DNA-Based Information Storage System: Robust Error-Tolerating Module

The size of digital data is ever increasing and is expected to grow to 40,000 EB by 2020, yet the estimated global information storage capacity in 2011 was <300 EB, indicating that most of the data are transient. DNA, as a very stable nano-molecule, is an ideal massive storage device for long-term...

Bibliographic Details
Main authors: Yim, Aldrin Kay-Yuen, Yu, Allen Chi-Shing, Li, Jing-Woei, Wong, Ada In-Chun, Loo, Jacky F. C., Chan, King Ming, Kong, S. K., Yip, Kevin Y., Chan, Ting-Fung
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2014
Subjects: Bioengineering and Biotechnology
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4222239/
https://www.ncbi.nlm.nih.gov/pubmed/25414846
http://dx.doi.org/10.3389/fbioe.2014.00049
author Yim, Aldrin Kay-Yuen
Yu, Allen Chi-Shing
Li, Jing-Woei
Wong, Ada In-Chun
Loo, Jacky F. C.
Chan, King Ming
Kong, S. K.
Yip, Kevin Y.
Chan, Ting-Fung
collection PubMed
description The size of digital data is ever increasing and is expected to grow to 40,000 EB by 2020, yet the estimated global information storage capacity in 2011 was <300 EB, indicating that most of the data are transient. DNA, as a very stable nano-molecule, is an ideal massive storage device for long-term data archiving. The two most notable illustrations are from Church et al. and Goldman et al., whose approaches are well optimized for most sequencing platforms: short synthesized DNA fragments without homopolymers. Here, we suggest improvements to the error-handling methodology that could enable the integration of DNA-based computational processes, e.g., algorithms based on self-assembly of DNA. As a proof of concept, a 438-byte picture was encoded into DNA with a low-density parity-check error-correction code. We salvaged a significant portion of the sequencing reads carrying mutations generated during DNA synthesis and sequencing, and successfully reconstructed the entire picture. A modular programming framework, DNAcodec, with an eXtensible Markup Language-based data format was also introduced. Our experiments demonstrated the practicability of long DNA message recovery with high error tolerance, which opens the field to biocomputing and synthetic biology.
format Online
Article
Text
id pubmed-4222239
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-4222239 2014-11-20 Front Bioeng Biotechnol Bioengineering and Biotechnology Frontiers Media S.A. 2014-11-06 Text en
Copyright © 2014 Yim, Yu, Li, Wong, Loo, Chan, Kong, Yip and Chan. http://creativecommons.org/licenses/by/4.0/
This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title The Essential Component in DNA-Based Information Storage System: Robust Error-Tolerating Module
topic Bioengineering and Biotechnology