
Deconvoluting the diversity of within-host pathogen strains in a multi-locus sequence typing framework

BACKGROUND: Bacterial pathogens exhibit an impressive amount of genomic diversity. This diversity can be informative of evolutionary adaptations, host-pathogen interactions, and disease transmission patterns. However, capturing this diversity directly from biological samples is challenging. RESULTS:...


Bibliographic Details
Main authors: Gan, Guo Liang; Willie, Elijah; Chauve, Cedric; Chindelevitch, Leonid
Format: Online Article Text
Language: English
Published: BioMed Central 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6915855/
https://www.ncbi.nlm.nih.gov/pubmed/31842753
http://dx.doi.org/10.1186/s12859-019-3204-8
author Gan, Guo Liang
Willie, Elijah
Chauve, Cedric
Chindelevitch, Leonid
author_sort Gan, Guo Liang
collection PubMed
description BACKGROUND: Bacterial pathogens exhibit an impressive amount of genomic diversity. This diversity can be informative of evolutionary adaptations, host-pathogen interactions, and disease transmission patterns. However, capturing this diversity directly from biological samples is challenging. RESULTS: We introduce a framework for understanding the within-host diversity of a pathogen using multi-locus sequence types (MLST) from whole-genome sequencing (WGS) data. Our approach consists of two stages. First we process each sample individually by assigning it, for each locus in the MLST scheme, a set of alleles and a proportion for each allele. Next, we associate to each sample a set of strain types using the alleles and the strain proportions obtained in the first step. We achieve this by using the smallest possible number of previously unobserved strains across all samples, while using those unobserved strains which are as close to the observed ones as possible, at the same time respecting the allele proportions as closely as possible. We solve both problems using mixed integer linear programming (MILP). Our method performs accurately on simulated data and generates results on a real data set of Borrelia burgdorferi genomes suggesting a high level of diversity for this pathogen. CONCLUSIONS: Our approach can apply to any bacterial pathogen with an MLST scheme, even though we developed it with Borrelia burgdorferi, the etiological agent of Lyme disease, in mind. Our work paves the way for robust strain typing in the presence of within-host heterogeneity, overcoming an essential challenge currently not addressed by any existing methodology for pathogen genomics.
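The second stage described in the abstract (explaining per-locus allele proportions with as few previously unobserved strains as possible) can be illustrated with a minimal sketch. This is not the authors' MILP formulation: it swaps in a brute-force search over a coarse proportion grid, omits the term that keeps new strains close to observed ones, and every name in it (`observed`, `known_strains`, `deconvolve`, `new_strain_penalty`) and all the toy data are assumptions made for illustration only.

```python
from itertools import product

# Hypothetical stage-one output for one sample: for each locus,
# a mapping from allele ID to its estimated proportion.
observed = [
    {1: 0.7, 2: 0.3},  # locus A
    {5: 0.7, 6: 0.3},  # locus B
]

# Hypothetical previously observed strain types (one allele per locus).
known_strains = {(1, 5), (2, 6)}


def implied_proportions(mixture, n_loci):
    """Per-locus allele proportions implied by a {strain: proportion} mixture."""
    implied = [dict() for _ in range(n_loci)]
    for strain, weight in mixture.items():
        for locus, allele in enumerate(strain):
            implied[locus][allele] = implied[locus].get(allele, 0.0) + weight
    return implied


def proportion_error(mixture, observed):
    """Sum of absolute differences between implied and observed proportions."""
    implied = implied_proportions(mixture, len(observed))
    error = 0.0
    for imp, obs in zip(implied, observed):
        for allele in set(imp) | set(obs):
            error += abs(imp.get(allele, 0.0) - obs.get(allele, 0.0))
    return error


def compositions(total, parts):
    """Yield all tuples of `parts` non-negative integers summing to `total`."""
    if parts == 1:
        yield (total,)
        return
    for first in range(total + 1):
        for rest in compositions(total - first, parts - 1):
            yield (first,) + rest


def deconvolve(observed, known_strains, steps=10, new_strain_penalty=1.0):
    """Brute-force search: penalize unobserved strains, then proportion error."""
    # Candidate strains: every combination of alleles seen at each locus.
    candidates = list(product(*[sorted(locus) for locus in observed]))
    best_score, best_mixture = None, None
    for counts in compositions(steps, len(candidates)):
        mixture = {s: c / steps for s, c in zip(candidates, counts) if c > 0}
        n_new = sum(1 for s in mixture if s not in known_strains)
        score = new_strain_penalty * n_new + proportion_error(mixture, observed)
        if best_score is None or score < best_score:
            best_score, best_mixture = score, mixture
    return best_mixture, best_score


mixture, score = deconvolve(observed, known_strains)
print(mixture, score)
```

On this toy input the search selects the two known strains and no new ones. The exhaustive enumeration only works for a handful of loci and alleles, which is why a formulation such as the MILP mentioned in the abstract is needed at realistic scale.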
format Online
Article
Text
id pubmed-6915855
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-6915855 2019-12-30 Deconvoluting the diversity of within-host pathogen strains in a multi-locus sequence typing framework Gan, Guo Liang; Willie, Elijah; Chauve, Cedric; Chindelevitch, Leonid BMC Bioinformatics Research
BioMed Central 2019-12-17 /pmc/articles/PMC6915855/ /pubmed/31842753 http://dx.doi.org/10.1186/s12859-019-3204-8 Text en © The Author(s) 2019 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Deconvoluting the diversity of within-host pathogen strains in a multi-locus sequence typing framework
title_sort deconvoluting the diversity of within-host pathogen strains in a multi-locus sequence typing framework
topic Research