Pan-genomic matching statistics for targeted nanopore sequencing

Nanopore sequencing is an increasingly powerful tool for genomics. Recently, computational advances have allowed nanopores to sequence in a targeted fashion; as the sequencer emits data, software can analyze the data in real time and signal the sequencer to eject “nontarget” DNA molecules. We presen...

Bibliographic Details
Main authors: Ahmed, Omar, Rossi, Massimiliano, Kovaka, Sam, Schatz, Michael C., Gagie, Travis, Boucher, Christina, Langmead, Ben
Format: Online Article Text
Language: English
Published: Elsevier 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8237286/
https://www.ncbi.nlm.nih.gov/pubmed/34195571
http://dx.doi.org/10.1016/j.isci.2021.102696
_version_ 1783714699975589888
author Ahmed, Omar
Rossi, Massimiliano
Kovaka, Sam
Schatz, Michael C.
Gagie, Travis
Boucher, Christina
Langmead, Ben
author_facet Ahmed, Omar
Rossi, Massimiliano
Kovaka, Sam
Schatz, Michael C.
Gagie, Travis
Boucher, Christina
Langmead, Ben
author_sort Ahmed, Omar
collection PubMed
description Nanopore sequencing is an increasingly powerful tool for genomics. Recently, computational advances have allowed nanopores to sequence in a targeted fashion; as the sequencer emits data, software can analyze the data in real time and signal the sequencer to eject “nontarget” DNA molecules. We present a novel method called SPUMONI, which enables rapid and accurate targeted sequencing using efficient pan-genome indexes. SPUMONI uses a compressed index to rapidly generate exact or approximate matching statistics in a streaming fashion. When used to target a specific strain in a mock community, SPUMONI achieves accuracy similar to that of minimap2 when both are run against an index containing many strains per species. However, SPUMONI is 12 times faster than minimap2. SPUMONI's index and peak memory footprint are also 16 and 4 times smaller than those of minimap2, respectively. This could enable accurate targeted sequencing even when the targeted strains have not necessarily been sequenced or assembled previously.
format Online
Article
Text
id pubmed-8237286
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Elsevier
record_format MEDLINE/PubMed
spelling pubmed-8237286 2021-06-29 Pan-genomic matching statistics for targeted nanopore sequencing Ahmed, Omar Rossi, Massimiliano Kovaka, Sam Schatz, Michael C. Gagie, Travis Boucher, Christina Langmead, Ben iScience Article Nanopore sequencing is an increasingly powerful tool for genomics. Recently, computational advances have allowed nanopores to sequence in a targeted fashion; as the sequencer emits data, software can analyze the data in real time and signal the sequencer to eject “nontarget” DNA molecules. We present a novel method called SPUMONI, which enables rapid and accurate targeted sequencing using efficient pan-genome indexes. SPUMONI uses a compressed index to rapidly generate exact or approximate matching statistics in a streaming fashion. When used to target a specific strain in a mock community, SPUMONI achieves accuracy similar to that of minimap2 when both are run against an index containing many strains per species. However, SPUMONI is 12 times faster than minimap2. SPUMONI's index and peak memory footprint are also 16 and 4 times smaller than those of minimap2, respectively. This could enable accurate targeted sequencing even when the targeted strains have not necessarily been sequenced or assembled previously. Elsevier 2021-06-08 /pmc/articles/PMC8237286/ /pubmed/34195571 http://dx.doi.org/10.1016/j.isci.2021.102696 Text en © 2021 The Author(s) https://creativecommons.org/licenses/by/4.0/ This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
spellingShingle Article
Ahmed, Omar
Rossi, Massimiliano
Kovaka, Sam
Schatz, Michael C.
Gagie, Travis
Boucher, Christina
Langmead, Ben
Pan-genomic matching statistics for targeted nanopore sequencing
title Pan-genomic matching statistics for targeted nanopore sequencing
title_full Pan-genomic matching statistics for targeted nanopore sequencing
title_fullStr Pan-genomic matching statistics for targeted nanopore sequencing
title_full_unstemmed Pan-genomic matching statistics for targeted nanopore sequencing
title_short Pan-genomic matching statistics for targeted nanopore sequencing
title_sort pan-genomic matching statistics for targeted nanopore sequencing
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8237286/
https://www.ncbi.nlm.nih.gov/pubmed/34195571
http://dx.doi.org/10.1016/j.isci.2021.102696
work_keys_str_mv AT ahmedomar pangenomicmatchingstatisticsfortargetednanoporesequencing
AT rossimassimiliano pangenomicmatchingstatisticsfortargetednanoporesequencing
AT kovakasam pangenomicmatchingstatisticsfortargetednanoporesequencing
AT schatzmichaelc pangenomicmatchingstatisticsfortargetednanoporesequencing
AT gagietravis pangenomicmatchingstatisticsfortargetednanoporesequencing
AT boucherchristina pangenomicmatchingstatisticsfortargetednanoporesequencing
AT langmeadben pangenomicmatchingstatisticsfortargetednanoporesequencing