Genome Informatics and Machine Learning-Based Identification of Antimicrobial Resistance-Encoding Features and Virulence Attributes in Escherichia coli Genomes Representing Globally Prevalent Lineages, Including High-Risk Clonal Complexes

Bibliographic Details
Main authors: Shaik, Sabiha, Singh, Anuradha, Suresh, Arya, Ahmed, Niyaz
Format: Online Article Text
Language: English
Published: American Society for Microbiology 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8844930/
https://www.ncbi.nlm.nih.gov/pubmed/35164570
http://dx.doi.org/10.1128/mbio.03796-21
_version_ 1784651572501807104
author Shaik, Sabiha
Singh, Anuradha
Suresh, Arya
Ahmed, Niyaz
author_sort Shaik, Sabiha
collection PubMed
description Escherichia coli, a ubiquitous commensal/pathogenic member from the Enterobacteriaceae family, accounts for high infection burden, morbidity, and mortality throughout the world. With emerging multidrug resistance (MDR) on a massive scale, E. coli has been listed as one of the Global Antimicrobial Resistance and Use Surveillance System (GLASS) priority pathogens. Understanding the resistance mechanisms and underlying genomic features appears to be of utmost importance to tackle further spread of these multidrug-resistant superbugs. While a few of the globally prevalent sequence types (STs) of E. coli, such as ST131, ST69, ST405, and ST648, have been previously reported to be highly virulent and harboring MDR, there is no clarity if certain ST lineages have a greater propensity to acquire MDR. In this study, large-scale comparative genomics of a total of 5,653 E. coli genomes from 19 ST lineages revealed ST-wide prevalence patterns of genomic features, such as antimicrobial resistance (AMR)-encoding genes/mutations, virulence genes, integrons, and transposons. Interpretation of the importance of these features using a Random Forest Classifier trained with 11,988 genomic features from whole-genome sequence data identified ST-specific or phylogroup-specific signature proteins mostly belonging to different protein superfamilies, including the toxin-antitoxin systems. Our study provides a comprehensive understanding of a myriad of genomic features, ST-specific proteins, and resistance mechanisms entailing different lineages of E. coli at the level of genomes; this could be of significant downstream importance in understanding the mechanisms of AMR, in clinical discovery, in epidemiology, and in devising control strategies.
format Online
Article
Text
id pubmed-8844930
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher American Society for Microbiology
record_format MEDLINE/PubMed
spelling pubmed-8844930 2022-02-17 mBio Research Article American Society for Microbiology 2022-02-15 /pmc/articles/PMC8844930/ /pubmed/35164570 http://dx.doi.org/10.1128/mbio.03796-21 Text en Copyright © 2022 Shaik et al. This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International license (https://creativecommons.org/licenses/by/4.0/).
title Genome Informatics and Machine Learning-Based Identification of Antimicrobial Resistance-Encoding Features and Virulence Attributes in Escherichia coli Genomes Representing Globally Prevalent Lineages, Including High-Risk Clonal Complexes
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8844930/
https://www.ncbi.nlm.nih.gov/pubmed/35164570
http://dx.doi.org/10.1128/mbio.03796-21