Rapid and accurate SNP genotyping of clonal bacterial pathogens with BioHansel

Hierarchical genotyping approaches can provide insights into the source, geography and temporal distribution of bacterial pathogens. Multiple hierarchical SNP genotyping schemes have previously been developed so that new isolates can rapidly be placed within pre-computed population structures, witho...

Bibliographic Details
Main authors: Labbé, Geneviève, Kruczkiewicz, Peter, Robertson, James, Mabon, Philip, Schonfeld, Justin, Kein, Daniel, Rankin, Marisa A., Gopez, Matthew, Hole, Darian, Son, David, Knox, Natalie, Laing, Chad R., Bessonov, Kyrylo, Taboada, Eduardo N., Yoshida, Catherine, Ziebell, Kim, Nichani, Anil, Johnson, Roger P., Van Domselaar, Gary, Nash, John H. E.
Format: Online Article Text
Language: English
Published: Microbiology Society 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8715432/
https://www.ncbi.nlm.nih.gov/pubmed/34554082
http://dx.doi.org/10.1099/mgen.0.000651
author Labbé, Geneviève
Kruczkiewicz, Peter
Robertson, James
Mabon, Philip
Schonfeld, Justin
Kein, Daniel
Rankin, Marisa A.
Gopez, Matthew
Hole, Darian
Son, David
Knox, Natalie
Laing, Chad R.
Bessonov, Kyrylo
Taboada, Eduardo N.
Yoshida, Catherine
Ziebell, Kim
Nichani, Anil
Johnson, Roger P.
Van Domselaar, Gary
Nash, John H. E.
collection PubMed
description Hierarchical genotyping approaches can provide insights into the source, geography and temporal distribution of bacterial pathogens. Multiple hierarchical SNP genotyping schemes have previously been developed so that new isolates can rapidly be placed within pre-computed population structures, without the need to rebuild phylogenetic trees for the entire dataset. This classification approach has, however, seen limited uptake in routine public health settings due to analytical complexity and the lack of standardized tools that provide clear and easy ways to interpret results. The BioHansel tool was developed to provide an organism-agnostic tool for hierarchical SNP-based genotyping. The tool identifies split k-mers that distinguish predefined lineages in whole genome sequencing (WGS) data using SNP-based genotyping schemes. BioHansel uses the Aho-Corasick algorithm to type isolates from assembled genomes or raw read sequence data in a matter of seconds, with limited computational resources. This makes BioHansel ideal for use by public health agencies that rely on WGS methods for surveillance of bacterial pathogens. Genotyping results are evaluated using a quality assurance module which identifies problematic samples, such as low-quality or contaminated datasets. Using existing hierarchical SNP schemes for Mycobacterium tuberculosis and Salmonella Typhi, we compare the genotyping results obtained with the k-mer-based tools BioHansel and SKA, with those of the organism-specific tools TBProfiler and genotyphi, which use gold-standard reference-mapping approaches. We show that the genotyping results are fully concordant across these different methods, and that the k-mer-based tools are significantly faster. We also test the ability of the BioHansel quality assurance module to detect intra-lineage contamination and demonstrate that it is effective, even in populations with low genetic diversity. 
We demonstrate the scalability of the tool using a dataset of ~8100 S. Typhi public genomes and provide the aggregated results of geographical distributions as part of the tool’s output. BioHansel is an open source Python 3 application available on PyPI and Conda repositories and as a Galaxy tool from the public Galaxy Toolshed. In a public health context, BioHansel enables rapid and high-resolution classification of bacterial pathogens with low genetic diversity.
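The description above says that split k-mers are matched against sequence data with the Aho-Corasick algorithm. The following is a minimal sketch of that idea, not BioHansel's actual implementation: the two-level scheme, the 7-base k-mers and the function names (`build_automaton`, `find_labels`, `call_genotype`) are all hypothetical and chosen only for illustration. Each SNP is represented by a positive and a negative k-mer that differ only at the central base, and both strands are searched in a single pass.

```python
from collections import deque

# Hypothetical toy scheme: genotype -> (positive k-mer, negative k-mer).
# The two k-mers of a pair differ only at the central (SNP) base.
TOY_SCHEME = {
    "1": ("ACGTTGC", "ACGATGC"),
    "1.2": ("GGCACTA", "GGCGCTA"),
}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def build_automaton(patterns):
    """Build Aho-Corasick tables (goto, fail, out) from {pattern: label}."""
    goto, fail, out = [{}], [0], [[]]
    for pattern, label in patterns.items():
        state = 0
        for ch in pattern:
            if ch not in goto[state]:
                goto.append({})
                fail.append(0)
                out.append([])
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        out[state].append(label)
    # Breadth-first pass: compute failure links and inherit their outputs.
    queue = deque(goto[0].values())
    while queue:
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            out[nxt] = out[nxt] + out[fail[nxt]]
    return goto, fail, out


def find_labels(sequence, automaton):
    """Scan the sequence once and collect the labels of all matched patterns."""
    goto, fail, out = automaton
    state, hits = 0, set()
    for ch in sequence:
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        hits.update(out[state])
    return hits


def call_genotype(sequence, scheme=TOY_SCHEME):
    """Return the most specific genotype with a positive k-mer hit, or None."""
    patterns = {}
    for genotype, (positive, negative) in scheme.items():
        for kmer, is_positive in ((positive, True), (negative, False)):
            patterns[kmer] = (genotype, is_positive)
            patterns[revcomp(kmer)] = (genotype, is_positive)
    hits = find_labels(sequence, build_automaton(patterns))
    positive_hits = [g for g, is_positive in hits if is_positive]
    if not positive_hits:
        return None
    # Deeper levels of the hierarchy contain more dot-separated parts.
    return max(positive_hits, key=lambda g: g.count("."))


sample = "TTTACGTTGCAAAGGCACTATTT"
print(call_genotype(sample))
print(call_genotype(revcomp(sample)))
```

Because every k-mer is loaded into one automaton, the sequence is read once regardless of how many SNPs the scheme defines. A real scheme would also need to handle conflicting positive and negative hits for the same SNP, which this sketch ignores.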
format Online
Article
Text
id pubmed-8715432
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Microbiology Society
record_format MEDLINE/PubMed
spelling pubmed-8715432 2021-12-29 Microb Genom Research Articles Microbiology Society 2021-09-23 /pmc/articles/PMC8715432/ /pubmed/34554082 http://dx.doi.org/10.1099/mgen.0.000651 Text en © 2021 The Authors https://creativecommons.org/licenses/by-nc/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution NonCommercial License.
title Rapid and accurate SNP genotyping of clonal bacterial pathogens with BioHansel
topic Research Articles
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8715432/
https://www.ncbi.nlm.nih.gov/pubmed/34554082
http://dx.doi.org/10.1099/mgen.0.000651