Source Attribution of Human Campylobacteriosis Using Whole-Genome Sequencing Data and Network Analysis

Bibliographic Details
Main Authors: Wainaina, Lynda; Merlotti, Alessandra; Remondini, Daniel; Henri, Clementine; Hald, Tine; Njage, Patrick Murigu Kamau
Format: Online Article Text
Language: English
Published: MDPI 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9229307/
https://www.ncbi.nlm.nih.gov/pubmed/35745499
http://dx.doi.org/10.3390/pathogens11060645
author Wainaina, Lynda
Merlotti, Alessandra
Remondini, Daniel
Henri, Clementine
Hald, Tine
Njage, Patrick Murigu Kamau
collection PubMed
description Campylobacter spp. are a leading and increasing cause of gastrointestinal infections worldwide. Source attribution, which apportions human infection cases to different animal species and food reservoirs, has been instrumental in control- and evidence-based intervention efforts. The rapid increase in whole-genome sequencing (WGS) data provides an opportunity for higher-resolution source attribution models. Important challenges, including the high dimension and complex structure of WGS data, have inspired concerted research efforts to develop new models. We propose network analysis models as an accurate, high-resolution source attribution approach for the sources of human campylobacteriosis. A weighted network analysis approach was used in this study for source attribution, comparing different WGS data inputs. The compared model inputs consisted of cgMLST and wgMLST distance matrices from 717 human and 717 animal isolates from cattle, chickens, dogs, ducks, pigs and turkeys. SNP distance matrices from 720 human and 720 animal isolates were also used. The data were collected from 2015 to 2017 in Denmark, with the animal sources consisting of domestic sources and imports from 7 European countries. Clusters consisted of network nodes representing respective genomes and links representing distances between genomes. Based on the results, animal sources were the main driving factor for cluster formation, followed by type of species and sampling year. The coherence source clustering (CSC) values based on animal sources were [Formula: see text], [Formula: see text] and [Formula: see text] for cgMLST, wgMLST and SNP, respectively. The CSC values based on Campylobacter species were [Formula: see text], [Formula: see text] and [Formula: see text] for cgMLST, wgMLST and SNP, respectively. Including human isolates in the network resulted in [Formula: see text], [Formula: see text] and [Formula: see text] of the total human isolates being clustered with the different animal sources for cgMLST, wgMLST and SNP, respectively. Between [Formula: see text] and [Formula: see text] of human isolates were not attributed to any animal source. Most of the human genomes were attributed to chickens from Denmark, with an average attribution percentage of [Formula: see text], [Formula: see text] and [Formula: see text] for cgMLST, wgMLST and SNP distance matrices, respectively, while ducks from Denmark showed the least attribution of [Formula: see text] for all three distance matrices. The best-performing model was the one using the wgMLST distance matrix as input data, which had a CSC value of [Formula: see text]. Results from our study show that the weighted network-based approach for source attribution is reliable and, given the high performance of the model, can be used as an alternative method for source attribution. The model is also robust across the different Campylobacter species, animal sources and WGS data types used as input.
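The description above outlines clustering genomes in a network built from pairwise distances and attributing human isolates to the animal sources they cluster with. The Python sketch below illustrates that general idea only; it is not the authors' model. It assumes a simplified, unweighted variant in which two genomes are linked when their distance is at or below a cutoff, clusters are connected components, and each human isolate is assigned the most common animal source in its cluster. The function names, the `threshold` parameter and the toy distance matrix are all hypothetical and do not come from the record.

```python
from collections import Counter


def cluster_by_threshold(dist, threshold):
    """Return a cluster id per genome: connected components of the graph
    that links two genomes when their distance is <= threshold (assumed rule)."""
    n = len(dist)
    parent = list(range(n))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if dist[i][j] <= threshold:
                parent[find(i)] = find(j)
    return [find(i) for i in range(n)]


def attribute_humans(dist, labels, threshold):
    """Map each human isolate index to the most common animal source in its
    cluster, or None when the cluster contains no animal isolate."""
    cluster_ids = cluster_by_threshold(dist, threshold)
    sources_in_cluster = {}
    for idx, label in enumerate(labels):
        if label != "human":
            sources_in_cluster.setdefault(cluster_ids[idx], Counter())[label] += 1
    result = {}
    for idx, label in enumerate(labels):
        if label == "human":
            counts = sources_in_cluster.get(cluster_ids[idx])
            result[idx] = counts.most_common(1)[0][0] if counts else None
    return result


# Toy, made-up symmetric distance matrix for six isolates (illustration only).
labels = ["chicken", "chicken", "cattle", "human", "human", "human"]
dist = [
    [0, 2, 9, 1, 9, 9],
    [2, 0, 9, 2, 9, 9],
    [9, 9, 0, 9, 3, 9],
    [1, 2, 9, 0, 9, 9],
    [9, 9, 3, 9, 0, 9],
    [9, 9, 9, 9, 9, 0],
]
attribution = attribute_humans(dist, labels, threshold=3)
```

In this toy input, the human isolate at index 5 has no neighbour within the cutoff, so it stays unattributed, mirroring the share of human isolates that the description reports as not attributed to any animal source.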
format Online
Article
Text
id pubmed-9229307
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-9229307 2022-06-25 Source Attribution of Human Campylobacteriosis Using Whole-Genome Sequencing Data and Network Analysis Pathogens Article MDPI 2022-06-03 /pmc/articles/PMC9229307/ /pubmed/35745499 http://dx.doi.org/10.3390/pathogens11060645 Text en © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title Source Attribution of Human Campylobacteriosis Using Whole-Genome Sequencing Data and Network Analysis
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9229307/
https://www.ncbi.nlm.nih.gov/pubmed/35745499
http://dx.doi.org/10.3390/pathogens11060645