Syotti: scalable bait design for DNA enrichment

MOTIVATION: Bait enrichment is a protocol that is becoming increasingly ubiquitous as it has been shown to successfully amplify regions of interest in metagenomic samples. In this method, a set of synthetic probes (‘baits’) is designed, manufactured and applied to fragmented metagenomic DNA. The pr...

Bibliographic Details
Main authors: Alanko, Jarno N, Slizovskiy, Ilya B, Lokshtanov, Daniel, Gagie, Travis, Noyes, Noelle R, Boucher, Christina
Format: Online Article Text
Language: English
Published: Oxford University Press 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9235489/
https://www.ncbi.nlm.nih.gov/pubmed/35758776
http://dx.doi.org/10.1093/bioinformatics/btac226
author Alanko, Jarno N
Slizovskiy, Ilya B
Lokshtanov, Daniel
Gagie, Travis
Noyes, Noelle R
Boucher, Christina
collection PubMed
description MOTIVATION: Bait enrichment is a protocol that is becoming increasingly ubiquitous as it has been shown to successfully amplify regions of interest in metagenomic samples. In this method, a set of synthetic probes (‘baits’) is designed, manufactured and applied to fragmented metagenomic DNA. The probes bind to the fragmented DNA and any unbound DNA is rinsed away, leaving the bound fragments to be amplified for sequencing. Metsky et al. demonstrated that bait enrichment is capable of detecting a large number of human viral pathogens within metagenomic samples. RESULTS: We formalize the problem of designing baits by defining the Minimum Bait Cover problem, show that the problem is NP-hard even under very restrictive assumptions, and design an efficient heuristic that takes advantage of succinct data structures. We refer to our method as Syotti. The running time of Syotti shows linear scaling in practice, running at least an order of magnitude faster than state-of-the-art methods, including the method of Metsky et al. At the same time, our method produces bait sets that are smaller than the ones produced by the competing methods, while also leaving fewer positions uncovered. Lastly, we show that Syotti requires only 25 min to design baits for a dataset comprising 3 billion nucleotides from 1000 related bacterial substrains, whereas the method of Metsky et al. shows clearly super-linear running time and fails to process even a subset of 17% of the data in 72 h. AVAILABILITY AND IMPLEMENTATION: https://github.com/jnalanko/syotti. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
format Online
Article
Text
id pubmed-9235489
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-9235489 2022-06-29 Syotti: scalable bait design for DNA enrichment Alanko, Jarno N Slizovskiy, Ilya B Lokshtanov, Daniel Gagie, Travis Noyes, Noelle R Boucher, Christina Bioinformatics ISCB/Ismb 2022
Oxford University Press 2022-06-27 /pmc/articles/PMC9235489/ /pubmed/35758776 http://dx.doi.org/10.1093/bioinformatics/btac226 Text en © The Author(s) 2022. Published by Oxford University Press. https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title Syotti: scalable bait design for DNA enrichment
topic ISCB/Ismb 2022
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9235489/
https://www.ncbi.nlm.nih.gov/pubmed/35758776
http://dx.doi.org/10.1093/bioinformatics/btac226