Molecular-level similarity search brings computing to DNA data storage
As global demand for digital storage capacity grows, storage technologies based on synthetic DNA have emerged as a dense and durable alternative to traditional media. Existing approaches leverage robust error correcting codes and precise molecular mechanisms to reliably retrieve specific files from...
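Main authors: | Bee, Callista; Chen, Yuan-Jyue; Queen, Melissa; Ward, David; Liu, Xiaomeng; Organick, Lee; Seelig, Georg; Strauss, Karin; Ceze, Luis |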
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Nature Publishing Group UK, 2021 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8346626/ https://www.ncbi.nlm.nih.gov/pubmed/34362913 http://dx.doi.org/10.1038/s41467-021-24991-z |
_version_ | 1783734917131141120 |
---|---|
author | Bee, Callista; Chen, Yuan-Jyue; Queen, Melissa; Ward, David; Liu, Xiaomeng; Organick, Lee; Seelig, Georg; Strauss, Karin; Ceze, Luis |
author_facet | Bee, Callista; Chen, Yuan-Jyue; Queen, Melissa; Ward, David; Liu, Xiaomeng; Organick, Lee; Seelig, Georg; Strauss, Karin; Ceze, Luis |
author_sort | Bee, Callista |
collection | PubMed |
description | As global demand for digital storage capacity grows, storage technologies based on synthetic DNA have emerged as a dense and durable alternative to traditional media. Existing approaches leverage robust error correcting codes and precise molecular mechanisms to reliably retrieve specific files from large databases. Typically, files are retrieved using a pre-specified key, analogous to a filename. However, these approaches lack the ability to perform more complex computations over the stored data, such as similarity search: e.g., finding images that look similar to an image of interest without prior knowledge of their file names. Here we demonstrate a technique for executing similarity search over a DNA-based database of 1.6 million images. Queries are implemented as hybridization probes, and a key step in our approach was to learn an image-to-sequence encoding ensuring that queries preferentially bind to targets representing visually similar images. Experimental results show that our molecular implementation performs comparably to state-of-the-art in silico algorithms for similarity search. |
format | Online Article Text |
id | pubmed-8346626 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | Nature Publishing Group UK |
record_format | MEDLINE/PubMed |
spelling | pubmed-83466262021-08-20 Molecular-level similarity search brings computing to DNA data storage Bee, Callista Chen, Yuan-Jyue Queen, Melissa Ward, David Liu, Xiaomeng Organick, Lee Seelig, Georg Strauss, Karin Ceze, Luis Nat Commun Article As global demand for digital storage capacity grows, storage technologies based on synthetic DNA have emerged as a dense and durable alternative to traditional media. Existing approaches leverage robust error correcting codes and precise molecular mechanisms to reliably retrieve specific files from large databases. Typically, files are retrieved using a pre-specified key, analogous to a filename. However, these approaches lack the ability to perform more complex computations over the stored data, such as similarity search: e.g., finding images that look similar to an image of interest without prior knowledge of their file names. Here we demonstrate a technique for executing similarity search over a DNA-based database of 1.6 million images. Queries are implemented as hybridization probes, and a key step in our approach was to learn an image-to-sequence encoding ensuring that queries preferentially bind to targets representing visually similar images. Experimental results show that our molecular implementation performs comparably to state-of-the-art in silico algorithms for similarity search. Nature Publishing Group UK 2021-08-06 /pmc/articles/PMC8346626/ /pubmed/34362913 http://dx.doi.org/10.1038/s41467-021-24991-z Text en © The Author(s) 2021 https://creativecommons.org/licenses/by/4.0/Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. 
The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/. |
spellingShingle | Article Bee, Callista Chen, Yuan-Jyue Queen, Melissa Ward, David Liu, Xiaomeng Organick, Lee Seelig, Georg Strauss, Karin Ceze, Luis Molecular-level similarity search brings computing to DNA data storage |
title | Molecular-level similarity search brings computing to DNA data storage |
title_full | Molecular-level similarity search brings computing to DNA data storage |
title_fullStr | Molecular-level similarity search brings computing to DNA data storage |
title_full_unstemmed | Molecular-level similarity search brings computing to DNA data storage |
title_short | Molecular-level similarity search brings computing to DNA data storage |
title_sort | molecular-level similarity search brings computing to dna data storage |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8346626/ https://www.ncbi.nlm.nih.gov/pubmed/34362913 http://dx.doi.org/10.1038/s41467-021-24991-z |
work_keys_str_mv | AT beecallista molecularlevelsimilaritysearchbringscomputingtodnadatastorage AT chenyuanjyue molecularlevelsimilaritysearchbringscomputingtodnadatastorage AT queenmelissa molecularlevelsimilaritysearchbringscomputingtodnadatastorage AT warddavid molecularlevelsimilaritysearchbringscomputingtodnadatastorage AT liuxiaomeng molecularlevelsimilaritysearchbringscomputingtodnadatastorage AT organicklee molecularlevelsimilaritysearchbringscomputingtodnadatastorage AT seeliggeorg molecularlevelsimilaritysearchbringscomputingtodnadatastorage AT strausskarin molecularlevelsimilaritysearchbringscomputingtodnadatastorage AT cezeluis molecularlevelsimilaritysearchbringscomputingtodnadatastorage |