Robust data storage in DNA by de Bruijn graph-based de novo strand assembly
DNA data storage is a rapidly developing technology with great potential due to its high density, long-term durability, and low maintenance cost. The major technical challenges include various errors, such as strand breaks, rearrangements, and indels that frequently arise during DNA synthesis, ampli...
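Main Authors: | Song, Lifu; Geng, Feng; Gong, Zi-Yi; Chen, Xin; Tang, Jijun; Gong, Chunye; Zhou, Libang; Xia, Rui; Han, Ming-Zhe; Xu, Jing-Yi; Li, Bing-Zhi; Yuan, Ying-Jin |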
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Nature Publishing Group UK, 2022 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9468002/ https://www.ncbi.nlm.nih.gov/pubmed/36097016 http://dx.doi.org/10.1038/s41467-022-33046-w |
Field | Value |
---|---|
author | Song, Lifu; Geng, Feng; Gong, Zi-Yi; Chen, Xin; Tang, Jijun; Gong, Chunye; Zhou, Libang; Xia, Rui; Han, Ming-Zhe; Xu, Jing-Yi; Li, Bing-Zhi; Yuan, Ying-Jin |
collection | PubMed |
description | DNA data storage is a rapidly developing technology with great potential due to its high density, long-term durability, and low maintenance cost. The major technical challenges include various errors, such as strand breaks, rearrangements, and indels that frequently arise during DNA synthesis, amplification, sequencing, and preservation. In this study, a de novo strand assembly algorithm (DBGPS) is developed using de Bruijn graph and greedy path search to meet these challenges. DBGPS shows substantial advantages in handling DNA breaks, rearrangements, and indels. The robustness of DBGPS is demonstrated by accelerated aging, multiple independent data retrievals, deep error-prone PCR, and large-scale simulations. Remarkably, 6.8 MB of data is accurately recovered from a severely corrupted sample that has been treated at 70 °C for 70 days. With DBGPS, we are able to achieve a logical density of 1.30 bits/cycle and a physical density of 295 PB/g. |
format | Online Article Text |
id | pubmed-9468002 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Nature Publishing Group UK |
record_format | MEDLINE/PubMed |
spelling | pubmed-9468002 2022-09-14. Nat Commun. Article. Nature Publishing Group UK, 2022-09-12. Text en. © The Author(s) 2022. Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License (https://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/. |
title | Robust data storage in DNA by de Bruijn graph-based de novo strand assembly |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9468002/ https://www.ncbi.nlm.nih.gov/pubmed/36097016 http://dx.doi.org/10.1038/s41467-022-33046-w |
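
The description in this record says DBGPS combines a de Bruijn graph with greedy path search to assemble strands de novo. The following is a minimal Python sketch of that general idea only; it is not the authors' implementation. The function names, the `min_count` threshold, the seed-based start, and the toy reads are all assumptions made for illustration.

```python
from collections import Counter


def build_kmer_counts(reads, k):
    """Count every k-mer in the reads; each k-mer acts as a de Bruijn graph edge."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def greedy_assemble(counts, seed, length, k, min_count=2):
    """Extend `seed` (at least k-1 bases) one base at a time up to `length`.

    At each step the last k-1 bases are extended with the base whose k-mer
    has the highest count; k-mers seen fewer than `min_count` times are
    treated as noise (hypothetical threshold). Returns None on a dead end.
    """
    strand = seed
    while len(strand) < length:
        suffix = strand[-(k - 1):]
        candidates = [
            (counts[suffix + base], base)
            for base in "ACGT"
            if counts[suffix + base] >= min_count
        ]
        if not candidates:
            return None
        # Highest count wins; ties fall back to alphabetical order of the base.
        strand += max(candidates)[1]
    return strand


# Toy input: fragments of one 16-base strand plus one read with a substitution.
# No single read covers the whole strand.
reads = [
    "ACGTTGCAAG",
    "TTGCAAGCTTAGC",
    "ACGTTGCA",
    "GCAAGCTTAGC",
    "ACGTAGCAAG",  # error-containing read
]
counts = build_kmer_counts(reads, k=5)
print(greedy_assemble(counts, seed="ACGT", length=16, k=5))
```

Because the graph is built from k-mers rather than whole reads, fragments of a broken strand still contribute usable edges, and low-count k-mers from erroneous reads can be filtered out before the path search.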