
Sigmoni: classification of nanopore signal with a compressed pangenome index

Improvements in nanopore sequencing necessitate efficient classification methods, including pre-filtering and adaptive sampling algorithms that enrich for reads of interest. Signal-based approaches circumvent the computational bottleneck of basecalling. But past methods for signal-based classificati...


Bibliographic Details
Main Authors: Shivakumar, Vikram S., Ahmed, Omar Y., Kovaka, Sam, Zakeri, Mohsen, Langmead, Ben
Format: Online Article Text
Language: English
Published: Cold Spring Harbor Laboratory 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10462034/
https://www.ncbi.nlm.nih.gov/pubmed/37645873
http://dx.doi.org/10.1101/2023.08.15.553308
_version_ 1785097977488998400
author Shivakumar, Vikram S.
Ahmed, Omar Y.
Kovaka, Sam
Zakeri, Mohsen
Langmead, Ben
author_facet Shivakumar, Vikram S.
Ahmed, Omar Y.
Kovaka, Sam
Zakeri, Mohsen
Langmead, Ben
author_sort Shivakumar, Vikram S.
collection PubMed
description Improvements in nanopore sequencing necessitate efficient classification methods, including pre-filtering and adaptive sampling algorithms that enrich for reads of interest. Signal-based approaches circumvent the computational bottleneck of basecalling. But past methods for signal-based classification do not scale efficiently to large, repetitive references like pangenomes, limiting their utility to partial references or individual genomes. We introduce Sigmoni: a rapid, multiclass classification method based on the r-index that scales to references of hundreds of Gbps. Sigmoni quantizes nanopore signal into a discrete alphabet of picoamp ranges. It performs rapid, approximate matching using matching statistics, classifying reads based on distributions of picoamp matching statistics and co-linearity statistics. Sigmoni is 10–100× faster than previous methods for adaptive sampling in host depletion experiments with improved accuracy, and can query reads against large microbial or human pangenomes.
format Online
Article
Text
id pubmed-10462034
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Cold Spring Harbor Laboratory
record_format MEDLINE/PubMed
spelling pubmed-10462034 2023-08-29 Sigmoni: classification of nanopore signal with a compressed pangenome index Shivakumar, Vikram S. Ahmed, Omar Y. Kovaka, Sam Zakeri, Mohsen Langmead, Ben bioRxiv Article Improvements in nanopore sequencing necessitate efficient classification methods, including pre-filtering and adaptive sampling algorithms that enrich for reads of interest. Signal-based approaches circumvent the computational bottleneck of basecalling. But past methods for signal-based classification do not scale efficiently to large, repetitive references like pangenomes, limiting their utility to partial references or individual genomes. We introduce Sigmoni: a rapid, multiclass classification method based on the r-index that scales to references of hundreds of Gbps. Sigmoni quantizes nanopore signal into a discrete alphabet of picoamp ranges. It performs rapid, approximate matching using matching statistics, classifying reads based on distributions of picoamp matching statistics and co-linearity statistics. Sigmoni is 10–100× faster than previous methods for adaptive sampling in host depletion experiments with improved accuracy, and can query reads against large microbial or human pangenomes. Cold Spring Harbor Laboratory 2023-08-30 /pmc/articles/PMC10462034/ /pubmed/37645873 http://dx.doi.org/10.1101/2023.08.15.553308 Text en https://creativecommons.org/licenses/by/4.0/ This work is licensed under a Creative Commons Attribution 4.0 International License (https://creativecommons.org/licenses/by/4.0/), which allows reusers to distribute, remix, adapt, and build upon the material in any medium or format, so long as attribution is given to the creator. The license allows for commercial use.
spellingShingle Article
Shivakumar, Vikram S.
Ahmed, Omar Y.
Kovaka, Sam
Zakeri, Mohsen
Langmead, Ben
Sigmoni: classification of nanopore signal with a compressed pangenome index
title Sigmoni: classification of nanopore signal with a compressed pangenome index
title_full Sigmoni: classification of nanopore signal with a compressed pangenome index
title_fullStr Sigmoni: classification of nanopore signal with a compressed pangenome index
title_full_unstemmed Sigmoni: classification of nanopore signal with a compressed pangenome index
title_short Sigmoni: classification of nanopore signal with a compressed pangenome index
title_sort sigmoni: classification of nanopore signal with a compressed pangenome index
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10462034/
https://www.ncbi.nlm.nih.gov/pubmed/37645873
http://dx.doi.org/10.1101/2023.08.15.553308
work_keys_str_mv AT shivakumarvikrams sigmoniclassificationofnanoporesignalwithacompressedpangenomeindex
AT ahmedomary sigmoniclassificationofnanoporesignalwithacompressedpangenomeindex
AT kovakasam sigmoniclassificationofnanoporesignalwithacompressedpangenomeindex
AT zakerimohsen sigmoniclassificationofnanoporesignalwithacompressedpangenomeindex
AT langmeadben sigmoniclassificationofnanoporesignalwithacompressedpangenomeindex
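
Illustrative note: the abstract in this record says that Sigmoni quantizes nanopore signal into a discrete alphabet of picoamp ranges and then classifies reads using matching statistics. The Python sketch below is only a toy illustration of those two ideas and is not the authors' implementation. The function names, the equal-width binning, and the picoamp range are assumptions made for illustration, and a naive quadratic scan stands in for the r-index that the abstract names.

```python
from typing import List, Sequence


def quantize(signal_pa: Sequence[float], lo: float, hi: float, n_bins: int) -> List[int]:
    """Map each picoamp value to one of n_bins equal-width bins over [lo, hi].

    Equal-width bins are an assumption; the record does not say how the
    picoamp ranges are chosen.
    """
    width = (hi - lo) / n_bins
    symbols = []
    for x in signal_pa:
        clipped = min(max(x, lo), hi)  # values outside the range go to the edge bins
        symbols.append(min(int((clipped - lo) / width), n_bins - 1))
    return symbols


def matching_statistics(query: Sequence[int], reference: Sequence[int]) -> List[int]:
    """Naive matching statistics.

    ms[i] is the length of the longest prefix of query[i:] that occurs
    somewhere in reference. This brute-force scan is a stand-in for a
    compressed index and is only practical for tiny inputs.
    """
    ms = []
    for i in range(len(query)):
        best = 0
        for j in range(len(reference)):
            k = 0
            while (i + k < len(query) and j + k < len(reference)
                   and query[i + k] == reference[j + k]):
                k += 1
            best = max(best, k)
        ms.append(best)
    return ms


if __name__ == "__main__":
    # Hypothetical signal values in picoamps, binned into a 4-symbol alphabet.
    read = quantize([91.0, 76.5, 118.2, 63.0], lo=60.0, hi=120.0, n_bins=4)
    ref = quantize([61.0, 92.0, 80.0, 119.0, 70.0], lo=60.0, hi=120.0, n_bins=4)
    print(read, ref, matching_statistics(read, ref))
```

Long matching statistics along a read suggest that it resembles the indexed reference and short ones suggest that it does not. How those lengths are turned into a class decision is described in the article itself and is not reproduced here.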