Improved Genomic Identification, Clustering, and Serotyping of Shiga Toxin-Producing Escherichia coli Using Cluster/Serotype-Specific Gene Markers

Shiga toxin-producing Escherichia coli (STEC) have more than 470 serotypes. The well-known STEC O157:H7 serotype is a leading cause of STEC infections in humans. However, the incidence of non-O157:H7 STEC serotypes associated with foodborne outbreaks and human infections has increased in recent year...

Bibliographic Details
Main authors: Zhang, Xiaomei; Payne, Michael; Kaur, Sandeep; Lan, Ruiting
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8785982/
https://www.ncbi.nlm.nih.gov/pubmed/35083165
http://dx.doi.org/10.3389/fcimb.2021.772574
author Zhang, Xiaomei
Payne, Michael
Kaur, Sandeep
Lan, Ruiting
collection PubMed
description Shiga toxin-producing Escherichia coli (STEC) have more than 470 serotypes. The well-known STEC O157:H7 serotype is a leading cause of STEC infections in humans. However, the incidence of non-O157:H7 STEC serotypes associated with foodborne outbreaks and human infections has increased in recent years. Current detection and serotyping assays focus on O157 and the top six (“Big six”) non-O157 STEC serogroups. In this study, we performed phylogenetic analysis of nearly 41,000 publicly available STEC genomes representing 460 different STEC serotypes and identified 19 major and 229 minor STEC clusters. STEC cluster-specific gene markers were then identified through comparative genomic analysis. We further identified serotype-specific gene markers for the top 10 most frequent non-O157:H7 STEC serotypes. The cluster- or serotype-specific gene markers had 99.54% accuracy and more than 97.25% specificity when tested using 38,534 STEC and 14,216 non-STEC E. coli genomes, respectively. In addition, we developed a freely available in silico serotyping pipeline named STECFinder that combined these robust gene markers with established E. coli serotype-specific O and H antigen genes and stx genes for accurate identification, cluster determination and serotyping of STEC. STECFinder can assign 99.85% and 99.83% of 38,534 STEC isolates to STEC clusters using assembled genomes and Illumina reads, respectively, and can simultaneously predict stx subtypes and STEC serotypes. Using shotgun metagenomic sequencing reads of STEC-spiked food samples from a published study, we demonstrated that STECFinder can accurately detect the spiked STEC serotypes. The cluster/serotype-specific gene markers could also be adapted for culture-independent typing, facilitating rapid STEC typing. STECFinder is available as an installable package (https://github.com/LanLab/STECFinder) and will be useful for in silico STEC cluster identification and serotyping using genome data.
format Online
Article
Text
id pubmed-8785982
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-8785982 2022-01-25
Front Cell Infect Microbiol
Cellular and Infection Microbiology
Frontiers Media S.A. 2022-01-10
/pmc/articles/PMC8785982/ /pubmed/35083165 http://dx.doi.org/10.3389/fcimb.2021.772574
Text en
Copyright © 2022 Zhang, Payne, Kaur and Lan
https://creativecommons.org/licenses/by/4.0/
This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title Improved Genomic Identification, Clustering, and Serotyping of Shiga Toxin-Producing Escherichia coli Using Cluster/Serotype-Specific Gene Markers
topic Cellular and Infection Microbiology
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8785982/
https://www.ncbi.nlm.nih.gov/pubmed/35083165
http://dx.doi.org/10.3389/fcimb.2021.772574