
A Facile machine learning multi-classification model for Streptococcus agalactiae clonal complexes

BACKGROUND: The clinical significance of group B streptococcus (GBS) differs among clonal complexes (CCs); accurate strain typing of GBS would facilitate clinical prognostic evaluation, epidemiological investigation and infection control. The aim of this study was to construct a prac...


Bibliographic Details
Main Authors: Liu, Jingxian, Zhao, Jing, Huang, Chencui, Xu, Jingxu, Liu, Wei, Yu, Jiajia, Guan, Hongyan, Liu, Ying, Shen, Lisong
Format: Online Article Text
Language: English
Published: BioMed Central 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9675200/
https://www.ncbi.nlm.nih.gov/pubmed/36401296
http://dx.doi.org/10.1186/s12941-022-00541-3
author Liu, Jingxian
Zhao, Jing
Huang, Chencui
Xu, Jingxu
Liu, Wei
Yu, Jiajia
Guan, Hongyan
Liu, Ying
Shen, Lisong
collection PubMed
description BACKGROUND: The clinical significance of group B streptococcus (GBS) differs among clonal complexes (CCs); accurate strain typing of GBS would facilitate clinical prognostic evaluation, epidemiological investigation and infection control. The aim of this study was to construct a practical and facile CC prediction model for S. agalactiae. METHODS: A total of 325 non-duplicate GBS strains were collected from clinical samples at Xinhua Hospital, Shanghai, China. Multilocus sequence typing (MLST) was used for molecular classification, and the results were analyzed with Bionumeric 8.0 software to derive CCs. Antibiotic susceptibility testing was performed using the Vitek-2 Compact system combined with the Kirby-Bauer (K-B) method. Multiplex PCR was used for serotype identification. A total of 45 virulence genes associated with adhesion, invasion and immune evasion were detected by PCR and electrophoresis. Three types of features, namely antibiotic susceptibility (A), serotype (S) and virulence gene (V) test results, were used with the XGBoost algorithm to develop multi-class CC identification models. The performance of the proposed models was evaluated using receiver operating characteristic (ROC) curves. RESULTS: The 325 GBS strains were divided into 47 sequence types (STs), which were grouped into 7 major CCs: CC1, CC10, CC12, CC17, CC19, CC23 and CC24. A total of 18 features across the three kinds of tests (A, S, V) differed significantly among the CCs. The model based on all features (S&A&V) performed best, with an AUC of 0.9536. The model based on serotype and antibiotic resistance (S&A) used only 5 weighted features, performed well in predicting CCs with a mean AUC of 0.9212, and showed no statistically significant difference from the S&A&V model in predicting CC10, CC12, CC17, CC19, CC23 and CC24 (all p > 0.05). CONCLUSIONS: The S&A model requires the fewest parameters while maintaining high accuracy and predictive power for CCs. The established model is a promising tool for classifying GBS molecular types and suggests a substantive improvement in clinical application and epidemiological surveillance in GBS phenotyping. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12941-022-00541-3.
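The abstract reports a mean AUC across clonal complexes but does not state how the per-class values were averaged. As an illustration only, the sketch below assumes a macro-averaged one-vs-rest AUC computed from per-class predicted probabilities. The function names, the six toy isolates and their probabilities are hypothetical and are not taken from the study; the code does not train or call XGBoost, it only shows how such a summary metric could be derived from any multi-class classifier's output.

```python
from typing import Dict, List, Sequence


def binary_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUC via the rank-sum (Mann-Whitney U) identity; tied scores share the average rank."""
    pairs = sorted(zip(scores, labels), key=lambda p: p[0])
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs at least one positive and one negative sample")
    rank_sum_pos = 0.0
    i = 0
    while i < len(pairs):
        j = i
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # mean of the 1-based ranks i+1 .. j
        rank_sum_pos += avg_rank * sum(lbl for _, lbl in pairs[i:j])
        i = j
    u_stat = rank_sum_pos - n_pos * (n_pos + 1) / 2.0
    return u_stat / (n_pos * n_neg)


def macro_ovr_auc(probs: List[Dict[str, float]], truth: List[str]) -> Dict[str, float]:
    """One-vs-rest AUC per class, plus their unweighted (macro) mean under the key 'mean'."""
    classes = sorted(set(truth))
    result: Dict[str, float] = {}
    for c in classes:
        scores = [p[c] for p in probs]
        labels = [1 if t == c else 0 for t in truth]
        result[c] = binary_auc(scores, labels)
    result["mean"] = sum(result[c] for c in classes) / len(classes)
    return result


# Hypothetical toy data: true CC of six isolates and made-up class probabilities.
truth = ["CC1", "CC1", "CC17", "CC17", "CC19", "CC19"]
probs = [
    {"CC1": 0.8, "CC17": 0.1, "CC19": 0.1},
    {"CC1": 0.4, "CC17": 0.5, "CC19": 0.1},
    {"CC1": 0.2, "CC17": 0.7, "CC19": 0.1},
    {"CC1": 0.1, "CC17": 0.6, "CC19": 0.3},
    {"CC1": 0.3, "CC17": 0.2, "CC19": 0.5},
    {"CC1": 0.5, "CC17": 0.1, "CC19": 0.4},
]

result = macro_ovr_auc(probs, truth)
for name, value in result.items():
    print(f"{name}: {value:.4f}")
```

A macro average weights every clonal complex equally regardless of how many isolates it contains; a prevalence-weighted average would be a different, equally plausible reading of "mean AUC".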
format Online
Article
Text
id pubmed-9675200
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-9675200 2022-11-20 A Facile machine learning multi-classification model for Streptococcus agalactiae clonal complexes Liu, Jingxian Zhao, Jing Huang, Chencui Xu, Jingxu Liu, Wei Yu, Jiajia Guan, Hongyan Liu, Ying Shen, Lisong Ann Clin Microbiol Antimicrob Research BioMed Central 2022-11-18 /pmc/articles/PMC9675200/ /pubmed/36401296 http://dx.doi.org/10.1186/s12941-022-00541-3 Text en
© The Author(s) 2022. Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License (https://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
title A Facile machine learning multi-classification model for Streptococcus agalactiae clonal complexes
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9675200/
https://www.ncbi.nlm.nih.gov/pubmed/36401296
http://dx.doi.org/10.1186/s12941-022-00541-3