Expanding the Molecular Alphabet of DNA-Based Data Storage Systems with Neural Network Nanopore Readout Processing

DNA is a promising next-generation data storage medium, but challenges remain with synthesis costs and recording latency. Here, we describe a prototype of a DNA data storage system that uses an extended molecular alphabet combining natural and chemically modified nucleotides. Our r...

Bibliographic Details
Main authors: Tabatabaei, S. Kasra; Pham, Bach; Pan, Chao; Liu, Jingqian; Chandak, Shubham; Shorkey, Spencer A.; Hernandez, Alvaro G.; Aksimentiev, Aleksei; Chen, Min; Schroeder, Charles M.; Milenkovic, Olgica
Format: Online Article Text
Language: English
Published: American Chemical Society 2022
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8915253/
https://www.ncbi.nlm.nih.gov/pubmed/35212544
http://dx.doi.org/10.1021/acs.nanolett.1c04203
_version_ 1784667976607203328
author Tabatabaei, S. Kasra
Pham, Bach
Pan, Chao
Liu, Jingqian
Chandak, Shubham
Shorkey, Spencer A.
Hernandez, Alvaro G.
Aksimentiev, Aleksei
Chen, Min
Schroeder, Charles M.
Milenkovic, Olgica
author_sort Tabatabaei, S. Kasra
collection PubMed
description DNA is a promising next-generation data storage medium, but challenges remain with synthesis costs and recording latency. Here, we describe a prototype of a DNA data storage system that uses an extended molecular alphabet combining natural and chemically modified nucleotides. Our results show that MspA nanopores can discriminate different combinations and ordered sequences of natural and chemically modified nucleotides in custom-designed oligomers. We further demonstrate single-molecule sequencing of the extended alphabet using a neural network architecture that classifies raw current signals generated by Oxford Nanopore sequencers with an average accuracy exceeding 60% (39× larger than random guessing). Molecular dynamics simulations show that the majority of modified nucleotides lead to only minor perturbations of the DNA double helix. Overall, the extended molecular alphabet may potentially offer a nearly 2-fold increase in storage density and potentially the same order of reduction in the recording latency, thereby enabling new implementations of molecular recorders.
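The two figures quoted in the description can be checked with simple arithmetic. The sketch below is illustrative only and is not taken from the article: it assumes an ideal, error-free alphabet in which each symbol carries log2(size) bits, and it assumes that "random guessing" means a uniform pick among equally likely classes. The function names and the alphabet sizes 8, 12 and 16 are hypothetical; the record does not state the size of the extended alphabet.

```python
import math


def bits_per_symbol(alphabet_size: int) -> float:
    """Upper bound on bits per symbol for an ideal, error-free alphabet."""
    return math.log2(alphabet_size)


def density_gain(alphabet_size: int, baseline_size: int = 4) -> float:
    """Storage density relative to the natural 4-letter alphabet (A, C, G, T)."""
    return bits_per_symbol(alphabet_size) / bits_per_symbol(baseline_size)


def implied_class_count(accuracy: float, fold_over_random: float) -> float:
    """Number of equally likely classes implied by
    accuracy = fold_over_random * (1 / classes)."""
    return fold_over_random / accuracy


if __name__ == "__main__":
    # Hypothetical alphabet sizes, chosen only to show the trend.
    for size in (4, 8, 12, 16):
        print(size, round(density_gain(size), 2))
    # 60% accuracy quoted as 39x random guessing.
    print(round(implied_class_count(0.60, 39)))
```

Under these assumptions an exact 2-fold gain needs 16 symbols, so "nearly 2-fold" points to an alphabet somewhat smaller than that, and the 60% and 39× figures are consistent with roughly 65 equally likely classes.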
format Online
Article
Text
id pubmed-8915253
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher American Chemical Society
record_format MEDLINE/PubMed
spelling pubmed-8915253 2022-03-14 Expanding the Molecular Alphabet of DNA-Based Data Storage Systems with Neural Network Nanopore Readout Processing Tabatabaei, S. Kasra Pham, Bach Pan, Chao Liu, Jingqian Chandak, Shubham Shorkey, Spencer A. Hernandez, Alvaro G. Aksimentiev, Aleksei Chen, Min Schroeder, Charles M. Milenkovic, Olgica Nano Lett DNA is a promising next-generation data storage medium, but challenges remain with synthesis costs and recording latency. Here, we describe a prototype of a DNA data storage system that uses an extended molecular alphabet combining natural and chemically modified nucleotides. Our results show that MspA nanopores can discriminate different combinations and ordered sequences of natural and chemically modified nucleotides in custom-designed oligomers. We further demonstrate single-molecule sequencing of the extended alphabet using a neural network architecture that classifies raw current signals generated by Oxford Nanopore sequencers with an average accuracy exceeding 60% (39× larger than random guessing). Molecular dynamics simulations show that the majority of modified nucleotides lead to only minor perturbations of the DNA double helix. Overall, the extended molecular alphabet may potentially offer a nearly 2-fold increase in storage density and potentially the same order of reduction in the recording latency, thereby enabling new implementations of molecular recorders. American Chemical Society 2022-02-25 2022-03-09 /pmc/articles/PMC8915253/ /pubmed/35212544 http://dx.doi.org/10.1021/acs.nanolett.1c04203 Text en © 2022 The Authors. Published by American Chemical Society https://creativecommons.org/licenses/by-nc-nd/4.0/ Permits non-commercial access and re-use, provided that author attribution and integrity are maintained; but does not permit creation of adaptations or other derivative works (https://creativecommons.org/licenses/by-nc-nd/4.0/).
title Expanding the Molecular Alphabet of DNA-Based Data Storage Systems with Neural Network Nanopore Readout Processing
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8915253/
https://www.ncbi.nlm.nih.gov/pubmed/35212544
http://dx.doi.org/10.1021/acs.nanolett.1c04203