High-scale random access on DNA storage systems
Due to the rapid cost decline of synthesizing and sequencing deoxyribonucleic acid (DNA), high information density, and its durability of up to centuries, utilizing DNA as an information storage medium has received the attention of many scientists. State-of-the-art DNA storage systems exploit the hi...
Main authors: | El-Shaikh, Alex; Welzel, Marius; Heider, Dominik; Seeger, Bernhard |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2022 |
Subjects: | High Throughput Sequencing Methods |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8829907/ https://www.ncbi.nlm.nih.gov/pubmed/35156022 http://dx.doi.org/10.1093/nargab/lqab126 |
_version_ | 1784648165237981184 |
---|---|
author | El-Shaikh, Alex Welzel, Marius Heider, Dominik Seeger, Bernhard |
author_facet | El-Shaikh, Alex Welzel, Marius Heider, Dominik Seeger, Bernhard |
author_sort | El-Shaikh, Alex |
collection | PubMed |
description | Due to the rapid cost decline of synthesizing and sequencing deoxyribonucleic acid (DNA), high information density, and its durability of up to centuries, utilizing DNA as an information storage medium has received the attention of many scientists. State-of-the-art DNA storage systems exploit the high capacity of DNA and enable random access (predominantly random reads) by primers, which serve as unique identifiers for directly accessing data. However, primers come with a significant limitation regarding the maximum available number per DNA library. The number of different primers within a library is typically very small (e.g. ≈10). We propose a method to overcome this deficiency and present a general-purpose technique for addressing and directly accessing thousands to potentially millions of different data objects within the same DNA pool. Our approach utilizes a fountain code, sophisticated probe design, and microarray technologies. A key component is locality-sensitive hashing, making checks for dissimilarity among such a large number of probes and data objects feasible. |
format | Online Article Text |
id | pubmed-8829907 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-8829907 2022-02-11 High-scale random access on DNA storage systems El-Shaikh, Alex Welzel, Marius Heider, Dominik Seeger, Bernhard NAR Genom Bioinform High Throughput Sequencing Methods Due to the rapid cost decline of synthesizing and sequencing deoxyribonucleic acid (DNA), high information density, and its durability of up to centuries, utilizing DNA as an information storage medium has received the attention of many scientists. State-of-the-art DNA storage systems exploit the high capacity of DNA and enable random access (predominantly random reads) by primers, which serve as unique identifiers for directly accessing data. However, primers come with a significant limitation regarding the maximum available number per DNA library. The number of different primers within a library is typically very small (e.g. ≈10). We propose a method to overcome this deficiency and present a general-purpose technique for addressing and directly accessing thousands to potentially millions of different data objects within the same DNA pool. Our approach utilizes a fountain code, sophisticated probe design, and microarray technologies. A key component is locality-sensitive hashing, making checks for dissimilarity among such a large number of probes and data objects feasible. Oxford University Press 2022-01-14 /pmc/articles/PMC8829907/ /pubmed/35156022 http://dx.doi.org/10.1093/nargab/lqab126 Text en © The Author(s) 2022. Published by Oxford University Press on behalf of NAR Genomics and Bioinformatics. https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | High Throughput Sequencing Methods El-Shaikh, Alex Welzel, Marius Heider, Dominik Seeger, Bernhard High-scale random access on DNA storage systems |
title | High-scale random access on DNA storage systems |
title_full | High-scale random access on DNA storage systems |
title_fullStr | High-scale random access on DNA storage systems |
title_full_unstemmed | High-scale random access on DNA storage systems |
title_short | High-scale random access on DNA storage systems |
title_sort | high-scale random access on dna storage systems |
topic | High Throughput Sequencing Methods |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8829907/ https://www.ncbi.nlm.nih.gov/pubmed/35156022 http://dx.doi.org/10.1093/nargab/lqab126 |
work_keys_str_mv | AT elshaikhalex highscalerandomaccessondnastoragesystems AT welzelmarius highscalerandomaccessondnastoragesystems AT heiderdominik highscalerandomaccessondnastoragesystems AT seegerbernhard highscalerandomaccessondnastoragesystems |