Comparison of Whole Genome (wg-) and Core Genome (cg-) MLST (BioNumerics(TM)) Versus SNP Variant Calling for Epidemiological Investigation of Pseudomonas aeruginosa

Whole genome sequencing (WGS) is increasingly used for epidemiological investigations of pathogens. While SNP variant calling is currently considered the most suitable method, the choice of a representative reference genome and the isolate dependency of results limit standardization and affect re...

Bibliographic Details
Main Authors: Blanc, Dominique S., Magalhães, Bárbara, Koenig, Isabelle, Senn, Laurence, Grandbastien, Bruno
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2020
Subjects: Microbiology
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7387498/
https://www.ncbi.nlm.nih.gov/pubmed/32793169
http://dx.doi.org/10.3389/fmicb.2020.01729
author Blanc, Dominique S.
Magalhães, Bárbara
Koenig, Isabelle
Senn, Laurence
Grandbastien, Bruno
collection PubMed
description Whole genome sequencing (WGS) is increasingly used for epidemiological investigations of pathogens. While SNP variant calling is currently considered the most suitable method, the choice of a representative reference genome and the isolate dependency of results limit standardization and affect resolution in an unknown manner. Whole or core genome Multi Locus Sequence Typing (wg-, cg-MLST) represents an attractive alternative. Here, we assess the accuracy of wg- and cg-MLST by comparing results of four Pseudomonas aeruginosa datasets for which epidemiological and genomic data were previously described. Three datasets included 155 isolates from three different sequence types (ST) of P. aeruginosa collected in our ICUs over a 5-year period. The fourth dataset consisted of 10 isolates from an investigation of P. aeruginosa-contaminated hand soap. All isolates were previously analyzed by a core SNP approach. In this study, wg- and cg-MLST were performed in BioNumerics(TM) using a scheme developed by Applied-Maths. Correlation between SNP calling and wg- or cg-MLST results was evaluated by calculating linear regressions and their coefficients of correlation (R²) between the number of SNPs and the number of allele differences in pairwise comparisons of isolates. The numbers of SNPs and allele differences between isolates with close epidemiological linkage varied between 0–26 and 0–13, respectively. When compared to core-SNP calling, a higher coefficient of correlation was obtained with cgMLST (R² of 0.92–0.99) than with wgMLST (0.78–0.99). In one dataset, a putative homologous recombination of a large DNA fragment (202 loci) was identified among these isolates, affecting the dataset's phylogeny, but with no impact on the epidemiological analysis of outbreak isolates. In conclusion, we showed that the P. aeruginosa wgMLST scheme in BioNumerics(TM) is as discriminatory as the core-SNP calling approach and apparently useful for outbreak investigations. We also showed that epidemiologically linked isolates showed fewer than 26 SNPs or 13 allele differences. These are important figures for the distinction between outbreak and non-outbreak isolates when interpreting WGS results. However, as P. aeruginosa is highly recombinant, a cgMLST approach is preferable and attention should be paid to possible recombination of large DNA fragments.
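The comparison described in the abstract (pairwise allele differences regressed against pairwise SNP counts, summarized by R²) can be illustrated with a minimal sketch. This is not the BioNumerics(TM) implementation or the authors' script: the isolate names, allele profiles, and SNP counts below are hypothetical toy values, the function names are made up for illustration, and skipping loci with a missing allele call is an assumption rather than something the record states.

```python
from itertools import combinations


def allele_differences(profile_a, profile_b):
    """Count loci whose allele numbers differ between two MLST profiles.

    Loci with a missing call (None) in either profile are skipped
    (an assumption made for this sketch).
    """
    return sum(
        1
        for a, b in zip(profile_a, profile_b)
        if a is not None and b is not None and a != b
    )


def pairwise_allele_differences(profiles):
    """Return {(isolate1, isolate2): allele differences} for every pair."""
    return {
        (x, y): allele_differences(profiles[x], profiles[y])
        for x, y in combinations(sorted(profiles), 2)
    }


def linear_regression_r2(xs, ys):
    """Least-squares fit y = slope * x + intercept, plus R² of the fit."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r2 = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r2


# Hypothetical allele profiles: one allele number per locus, None = no call.
profiles = {
    "iso1": [1, 1, 1, 1, 1, 1],
    "iso2": [1, 1, 2, 1, 1, 1],
    "iso3": [1, 2, 2, 1, 3, 1],
    "iso4": [2, 2, 2, 4, 3, None],
}

# Hypothetical pairwise core-SNP counts for the same isolates.
snp_distances = {
    ("iso1", "iso2"): 2,
    ("iso1", "iso3"): 6,
    ("iso1", "iso4"): 10,
    ("iso2", "iso3"): 4,
    ("iso2", "iso4"): 8,
    ("iso3", "iso4"): 4,
}

allele_diffs = pairwise_allele_differences(profiles)
pairs = sorted(snp_distances)
xs = [snp_distances[p] for p in pairs]   # SNPs per pair
ys = [allele_diffs[p] for p in pairs]    # allele differences per pair
slope, intercept, r2 = linear_regression_r2(xs, ys)
print(f"slope={slope:.2f} intercept={intercept:.2f} R2={r2:.2f}")
```

With real data, `profiles` would come from the cgMLST or wgMLST allele calls and `snp_distances` from the core-SNP analysis; the R² value then indicates how closely the two distance measures track each other for a given dataset.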
format Online
Article
Text
id pubmed-7387498
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-7387498 2020-08-12 Front Microbiol Microbiology Frontiers Media S.A. 2020-07-22 /pmc/articles/PMC7387498/ /pubmed/32793169 http://dx.doi.org/10.3389/fmicb.2020.01729 Text en Copyright © 2020 Blanc, Magalhães, Koenig, Senn and Grandbastien. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title Comparison of Whole Genome (wg-) and Core Genome (cg-) MLST (BioNumerics(TM)) Versus SNP Variant Calling for Epidemiological Investigation of Pseudomonas aeruginosa
topic Microbiology
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7387498/
https://www.ncbi.nlm.nih.gov/pubmed/32793169
http://dx.doi.org/10.3389/fmicb.2020.01729