baseLess: lightweight detection of sequences in raw MinION data

Bibliographic Details
Main authors: Noordijk, Ben, Nijland, Reindert, Carrion, Victor J, Raaijmakers, Jos M, de Ridder, Dick, de Lannoy, Carlos
Format: Online Article Text
Language: English
Published: Oxford University Press 2023
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9936955/
https://www.ncbi.nlm.nih.gov/pubmed/36818730
http://dx.doi.org/10.1093/bioadv/vbad017
Description
Summary: With its candybar form factor and low initial investment cost, the MinION brought affordable portable nucleic acid analysis within reach. However, translating the electrical signal it outputs into a sequence of bases still requires mid-tier computer hardware, which remains a caveat when aiming for deployment of many devices at once or usage in remote areas. For applications focusing on detection of a target sequence, such as infectious disease monitoring or species identification, the computational cost of analysis may be reduced by directly detecting the target sequence in the electrical signal instead. Here, we present baseLess, a computational tool that enables such target-detection-only analysis. BaseLess makes use of an array of small neural networks, each of which efficiently detects a fixed-size subsequence of the target sequence directly from the electrical signal. We show that baseLess can accurately determine the identity of reads between three closely related fish species and can classify sequences in mixtures of 20 bacterial species, on an inexpensive single-board computer.

AVAILABILITY AND IMPLEMENTATION: baseLess and all code used in data preparation and validation are available on GitHub at https://github.com/cvdelannoy/baseLess, under an MIT license. Used validation data and scripts can be found at https://doi.org/10.4121/20261392, under an MIT license.

SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics Advances online.
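
The abstract describes an array of small detectors, each responsible for one fixed-size subsequence of the target. The sketch below illustrates only that general pattern and is not the baseLess code or its API. It assumes one possible decision rule (call a read on-target when a minimum fraction of detectors fire) and replaces each neural network with a toy threshold function. The names `make_toy_detector`, `read_contains_target`, `min_fraction`, and the example subsequences and signal values are all hypothetical.

```python
from typing import Callable, Dict, Sequence

# Hypothetical type: a detector maps a raw signal to a yes/no answer.
Detector = Callable[[Sequence[float]], bool]


def make_toy_detector(level: float, min_run: int) -> Detector:
    """Toy stand-in for one small neural network (assumption, not baseLess).

    Fires when the signal stays within 0.5 of `level` for at least
    `min_run` consecutive samples.
    """
    def detect(signal: Sequence[float]) -> bool:
        run = 0
        for sample in signal:
            run = run + 1 if abs(sample - level) < 0.5 else 0
            if run >= min_run:
                return True
        return False
    return detect


def read_contains_target(
    signal: Sequence[float],
    detectors: Dict[str, Detector],
    min_fraction: float,
) -> bool:
    """Assumed decision rule: on-target if enough detectors fire."""
    if not detectors:
        raise ValueError("at least one detector is required")
    hits = sum(1 for detect in detectors.values() if detect(signal))
    return hits / len(detectors) >= min_fraction


# One detector per (hypothetical) fixed-size subsequence of the target.
detectors = {
    "ACGTACGT": make_toy_detector(level=1.0, min_run=3),
    "GGTTCCAA": make_toy_detector(level=4.0, min_run=3),
    "TTTTACGC": make_toy_detector(level=7.0, min_run=3),
}

# Made-up signals: the first triggers two of the three detectors.
on_target = [1.0, 1.1, 0.9, 4.0, 4.2, 3.9, 9.0, 9.0]
off_target = [9.0, 9.1, 1.0, 1.1, 9.0, 4.0, 9.0]

on_target_call = read_contains_target(on_target, detectors, min_fraction=0.6)
off_target_call = read_contains_target(off_target, detectors, min_fraction=0.6)
```

Because each detector answers a single yes/no question on the raw signal, no base sequence is ever produced; in this sketch the per-read cost grows with the number of detectors rather than with a full basecalling model.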