FrameD: framework for DNA-based data storage design, verification, and validation
MOTIVATION: DNA-based data storage is a quickly growing field that hopes to harness the massive theoretical information density of DNA molecules to produce a competitive next-generation storage medium suitable for archival data. In recent years, many DNA-based storage system designs have been propos...
Main authors: | Volkel, Kevin D; Lin, Kevin N; Hook, Paul W; Timp, Winston; Keung, Albert J; Tuck, James M |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press, 2023 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10563143/ https://www.ncbi.nlm.nih.gov/pubmed/37713474 http://dx.doi.org/10.1093/bioinformatics/btad572 |
_version_ | 1785118276625367040 |
---|---|
author | Volkel, Kevin D Lin, Kevin N Hook, Paul W Timp, Winston Keung, Albert J Tuck, James M |
author_sort | Volkel, Kevin D |
collection | PubMed |
description | MOTIVATION: DNA-based data storage is a quickly growing field that hopes to harness the massive theoretical information density of DNA molecules to produce a competitive next-generation storage medium suitable for archival data. In recent years, many DNA-based storage system designs have been proposed. Given that no common infrastructure exists for simulating these storage systems, comparing many different designs along with many different error models is increasingly difficult. To address this challenge, we introduce FrameD, a simulation infrastructure for DNA storage systems that leverages the underlying modularity of DNA storage system designs to provide a framework to express different designs while being able to reuse common components. RESULTS: We demonstrate the utility of FrameD and the need for a common simulation platform using a case study. Our case study compares designs that utilize strand copies differently, some that align strand copies using multiple sequence alignment algorithms and others that do not. We found that the choice to include multiple sequence alignment in the pipeline is dependent on the error rate and the type of errors being injected and is not always beneficial. In addition to supporting a wide range of designs, FrameD provides the user with transparent parallelism to deal with a large number of reads from sequencing and the need for many fault injection iterations. We believe that FrameD fills a void in the tools publicly available to the DNA storage community by providing a modular and extensible framework with support for massive parallelism. As a result, it will help accelerate the design process of future DNA-based storage systems. AVAILABILITY AND IMPLEMENTATION: The source code for FrameD along with the data generated during the demonstration of FrameD is available in a public Github repository at https://github.com/dna-storage/framed, (https://dx.doi.org/10.5281/zenodo.7757762). |
format | Online Article Text |
id | pubmed-10563143 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-10563143 2023-10-11 FrameD: framework for DNA-based data storage design, verification, and validation Volkel, Kevin D Lin, Kevin N Hook, Paul W Timp, Winston Keung, Albert J Tuck, James M Bioinformatics Original Paper MOTIVATION: DNA-based data storage is a quickly growing field that hopes to harness the massive theoretical information density of DNA molecules to produce a competitive next-generation storage medium suitable for archival data. In recent years, many DNA-based storage system designs have been proposed. Given that no common infrastructure exists for simulating these storage systems, comparing many different designs along with many different error models is increasingly difficult. To address this challenge, we introduce FrameD, a simulation infrastructure for DNA storage systems that leverages the underlying modularity of DNA storage system designs to provide a framework to express different designs while being able to reuse common components. RESULTS: We demonstrate the utility of FrameD and the need for a common simulation platform using a case study. Our case study compares designs that utilize strand copies differently, some that align strand copies using multiple sequence alignment algorithms and others that do not. We found that the choice to include multiple sequence alignment in the pipeline is dependent on the error rate and the type of errors being injected and is not always beneficial. In addition to supporting a wide range of designs, FrameD provides the user with transparent parallelism to deal with a large number of reads from sequencing and the need for many fault injection iterations. We believe that FrameD fills a void in the tools publicly available to the DNA storage community by providing a modular and extensible framework with support for massive parallelism. As a result, it will help accelerate the design process of future DNA-based storage systems. AVAILABILITY AND IMPLEMENTATION: The source code for FrameD along with the data generated during the demonstration of FrameD is available in a public Github repository at https://github.com/dna-storage/framed, (https://dx.doi.org/10.5281/zenodo.7757762). Oxford University Press 2023-09-15 /pmc/articles/PMC10563143/ /pubmed/37713474 http://dx.doi.org/10.1093/bioinformatics/btad572 Text en © The Author(s) 2023. Published by Oxford University Press. https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Original Paper Volkel, Kevin D Lin, Kevin N Hook, Paul W Timp, Winston Keung, Albert J Tuck, James M FrameD: framework for DNA-based data storage design, verification, and validation |
title | FrameD: framework for DNA-based data storage design, verification, and validation |
topic | Original Paper |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10563143/ https://www.ncbi.nlm.nih.gov/pubmed/37713474 http://dx.doi.org/10.1093/bioinformatics/btad572 |