
Robust data storage in DNA by de Bruijn graph-based de novo strand assembly

DNA data storage is a rapidly developing technology with great potential due to its high density, long-term durability, and low maintenance cost. The major technical challenges include various errors, such as strand breaks, rearrangements, and indels that frequently arise during DNA synthesis, ampli...


Bibliographic Details
Main authors: Song, Lifu, Geng, Feng, Gong, Zi-Yi, Chen, Xin, Tang, Jijun, Gong, Chunye, Zhou, Libang, Xia, Rui, Han, Ming-Zhe, Xu, Jing-Yi, Li, Bing-Zhi, Yuan, Ying-Jin
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9468002/
https://www.ncbi.nlm.nih.gov/pubmed/36097016
http://dx.doi.org/10.1038/s41467-022-33046-w
_version_ 1784788317180526592
author Song, Lifu
Geng, Feng
Gong, Zi-Yi
Chen, Xin
Tang, Jijun
Gong, Chunye
Zhou, Libang
Xia, Rui
Han, Ming-Zhe
Xu, Jing-Yi
Li, Bing-Zhi
Yuan, Ying-Jin
author_sort Song, Lifu
collection PubMed
description DNA data storage is a rapidly developing technology with great potential due to its high density, long-term durability, and low maintenance cost. The major technical challenges include various errors, such as strand breaks, rearrangements, and indels that frequently arise during DNA synthesis, amplification, sequencing, and preservation. In this study, a de novo strand assembly algorithm (DBGPS) is developed using de Bruijn graph and greedy path search to meet these challenges. DBGPS shows substantial advantages in handling DNA breaks, rearrangements, and indels. The robustness of DBGPS is demonstrated by accelerated aging, multiple independent data retrievals, deep error-prone PCR, and large-scale simulations. Remarkably, 6.8 MB of data is accurately recovered from a severely corrupted sample that has been treated at 70 °C for 70 days. With DBGPS, we are able to achieve a logical density of 1.30 bits/cycle and a physical density of 295 PB/g.
format Online
Article
Text
id pubmed-9468002
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-9468002 2022-09-14 Robust data storage in DNA by de Bruijn graph-based de novo strand assembly. Song, Lifu; Geng, Feng; Gong, Zi-Yi; Chen, Xin; Tang, Jijun; Gong, Chunye; Zhou, Libang; Xia, Rui; Han, Ming-Zhe; Xu, Jing-Yi; Li, Bing-Zhi; Yuan, Ying-Jin. Nat Commun. Article. Nature Publishing Group UK 2022-09-12 /pmc/articles/PMC9468002/ /pubmed/36097016 http://dx.doi.org/10.1038/s41467-022-33046-w Text en © The Author(s) 2022. https://creativecommons.org/licenses/by/4.0/
Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
title Robust data storage in DNA by de Bruijn graph-based de novo strand assembly
topic Article
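
The description above says DBGPS combines a de Bruijn graph with a greedy path search to reassemble strands from broken or error-containing reads. The following is a minimal sketch of that general idea, not the authors' implementation. The function names, the k-mer size, the coverage threshold `min_count`, and the assumption that the first k bases of a strand are known in advance are all illustrative choices that do not come from the record.

```python
from collections import Counter


def count_kmers(reads, k):
    """Count every k-mer in the reads (an implicit de Bruijn graph)."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def greedy_assemble(counts, seed, length, min_count=2):
    """Extend `seed` base by base, taking the best-supported next k-mer.

    Assumes len(seed) == k >= 2. Returns None when no successor k-mer
    reaches `min_count` (a dead end in the graph).
    """
    k = len(seed)
    strand = seed
    while len(strand) < length:
        suffix = strand[-(k - 1):]
        candidates = [
            (counts[suffix + base], base)
            for base in "ACGT"
            if counts[suffix + base] >= min_count
        ]
        if not candidates:
            return None
        strand += max(candidates)[1]  # highest count wins
    return strand


# Hypothetical 18-base strand, seen only as fragments (strand breaks)
# plus one read carrying a substitution error at position 8.
strand = "ACGTTGCATGGACCTAGT"
reads = [
    strand[:10],
    strand[6:],
    strand[:12],
    strand[4:],
    strand[:8] + "C" + strand[9:12],  # erroneous read
]
counts = count_kmers(reads, 5)
recovered = greedy_assemble(counts, strand[:5], len(strand))
print(recovered == strand)
```

No single read covers the whole strand here, yet the overlapping k-mers let the search walk from the seed to the full length, and the k-mers introduced by the erroneous read are ignored because each appears only once. A practical decoder would also need a way to choose among competing paths and to validate the assembled strand, which this sketch leaves out.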