
ROCker Models for Reliable Detection and Typing of Short-Read Sequences Carrying β-Lactamase Genes

Identification of genes encoding β-lactamases (BLs) from short-read sequences remains challenging due to the high frequency of shared amino acid functional domains and motifs in proteins encoded by BL genes and related non-BL gene sequences. Divergent BL homologs can be frequently missed during simi...


Bibliographic Details
Main authors: Zhang, Si-Yu, Suttner, Brittany, Rodriguez-R, Luis M., Orellana, Luis H., Conrad, Roth E., Liu, Fang, Rowell, Jessica L., Webb, Hattie E., Williams-Newkirk, Amanda J., Huang, Andrew, Konstantinidis, Konstantinos T.
Format: Online Article Text
Language: English
Published: American Society for Microbiology 2022
Subjects: Methods and Protocols
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9238382/
https://www.ncbi.nlm.nih.gov/pubmed/35638728
http://dx.doi.org/10.1128/msystems.01281-21
author Zhang, Si-Yu
Suttner, Brittany
Rodriguez-R, Luis M.
Orellana, Luis H.
Conrad, Roth E.
Liu, Fang
Rowell, Jessica L.
Webb, Hattie E.
Williams-Newkirk, Amanda J.
Huang, Andrew
Konstantinidis, Konstantinos T.
author_sort Zhang, Si-Yu
collection PubMed
description Identification of genes encoding β-lactamases (BLs) from short-read sequences remains challenging due to the high frequency of shared amino acid functional domains and motifs in proteins encoded by BL genes and related non-BL gene sequences. Divergent BL homologs can be frequently missed during similarity searches, which has important practical consequences for monitoring antibiotic resistance. To address this limitation, we built ROCker models that targeted broad classes (e.g., class A, B, C, and D) and individual families (e.g., TEM) of BLs and challenged them with mock 150-bp- and 250-bp-read data sets of known composition. ROCker identifies most-discriminant bit score thresholds in sliding windows along the sequence of the target protein sequence and hence can account for nondiscriminative domains shared by unrelated proteins. BL ROCker models showed a 0% false-positive rate (FPR), a 0% to 4% false-negative rate (FNR), and an up-to-50-fold-higher F1 score [2 × precision × recall/(precision + recall)] compared to alternative methods, such as similarity searches using BLASTx with various e-value thresholds and BL hidden Markov models, or tools like DeepARG, ShortBRED, and AMRFinder. The ROCker models and the underlying protein sequence reference data sets and phylogenetic trees for read placement are freely available through http://enve-omics.ce.gatech.edu/data/rocker-bla. Application of these BL ROCker models to metagenomics, metatranscriptomics, and high-throughput PCR gene amplicon data should facilitate the reliable detection and quantification of BL variants encoded by environmental or clinical isolates and microbiomes and more accurate assessment of the associated public health risk, compared to the current practice.
IMPORTANCE Resistance genes encoding β-lactamases (BLs) confer resistance to the widely prescribed antibiotic class β-lactams. Therefore, it is important to assess the prevalence of BL genes in clinical or environmental samples for monitoring the spreading of these genes into pathogens and estimating public health risk. However, detecting BLs in short-read sequence data is technically challenging. Our ROCker model-based bioinformatics approach showcases the reliable detection and typing of BLs in complex data sets and thus contributes toward solving an important problem in antibiotic resistance surveillance. The ROCker models developed substantially expand the toolbox for monitoring antibiotic resistance in clinical or environmental settings.
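The description mentions two ideas that a small example may make more concrete: the F1 score formula and position-specific bit score thresholds chosen per window along the reference protein. The following is only an illustrative sketch, not the ROCker implementation. The `Hit` record, the function names, the window size, and the choice of F1 as the selection criterion are all assumptions made for illustration; the `is_target` label stands in for the known composition of a mock data set.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """Hypothetical record for one read-to-protein alignment."""
    midpoint: int      # alignment midpoint on the reference protein (1-based)
    bit_score: float
    is_target: bool    # ground truth, known only for mock data sets


def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1 = 2 * precision * recall / (precision + recall)."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def best_threshold(hits: list[Hit]) -> float:
    """Return the bit score cutoff that maximizes F1 for the given hits.

    Assumption: F1 is used as the selection criterion here only to keep
    the sketch short; it is not claimed to be ROCker's criterion.
    """
    best_t, best_f1 = 0.0, -1.0
    for t in sorted({h.bit_score for h in hits}):
        tp = sum(h.is_target and h.bit_score >= t for h in hits)
        fp = sum((not h.is_target) and h.bit_score >= t for h in hits)
        fn = sum(h.is_target and h.bit_score < t for h in hits)
        score = f1_score(tp, fp, fn)
        if score > best_f1:
            best_t, best_f1 = t, score
    return best_t


def window_thresholds(hits: list[Hit], protein_len: int,
                      window: int = 20, step: int = 10) -> dict[int, float]:
    """Map each window start position to its own bit score cutoff.

    A step smaller than the window gives overlapping (sliding) windows.
    Windows without any hits are skipped.
    """
    thresholds = {}
    for start in range(1, protein_len + 1, step):
        in_window = [h for h in hits if start <= h.midpoint < start + window]
        if in_window:
            thresholds[start] = best_threshold(in_window)
    return thresholds
```

In this toy setup, a region shared with unrelated proteins would end up with a higher cutoff than a region unique to the target family, which is the intuition behind using thresholds that vary along the protein rather than a single global cutoff.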
format Online
Article
Text
id pubmed-9238382
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher American Society for Microbiology
record_format MEDLINE/PubMed
spelling pubmed-9238382 2022-06-29 ROCker Models for Reliable Detection and Typing of Short-Read Sequences Carrying β-Lactamase Genes Zhang, Si-Yu Suttner, Brittany Rodriguez-R, Luis M. Orellana, Luis H. Conrad, Roth E. Liu, Fang Rowell, Jessica L. Webb, Hattie E. Williams-Newkirk, Amanda J. Huang, Andrew Konstantinidis, Konstantinos T. mSystems Methods and Protocols American Society for Microbiology 2022-05-31 /pmc/articles/PMC9238382/ /pubmed/35638728 http://dx.doi.org/10.1128/msystems.01281-21 Text en Copyright © 2022 Zhang et al. This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International license (https://creativecommons.org/licenses/by/4.0/).
title ROCker Models for Reliable Detection and Typing of Short-Read Sequences Carrying β-Lactamase Genes
topic Methods and Protocols