ClassTR: Classifying Within-Host Heterogeneity Based on Tandem Repeats with Application to Mycobacterium tuberculosis Infections
Genomic tools have revealed genetically diverse pathogens within some hosts. Within-host pathogen diversity, which we refer to as “complex infection”, is increasingly recognized as a determinant of treatment outcome for infections like tuberculosis. Complex infection arises through two mechanisms: w...
Main authors: | Chindelevitch, Leonid; Colijn, Caroline; Moodley, Prashini; Wilson, Douglas; Cohen, Ted |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science, 2016 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4734664/ https://www.ncbi.nlm.nih.gov/pubmed/26829497 http://dx.doi.org/10.1371/journal.pcbi.1004475 |
_version_ | 1782412947259129856 |
---|---|
author | Chindelevitch, Leonid Colijn, Caroline Moodley, Prashini Wilson, Douglas Cohen, Ted |
author_sort | Chindelevitch, Leonid |
collection | PubMed |
description | Genomic tools have revealed genetically diverse pathogens within some hosts. Within-host pathogen diversity, which we refer to as “complex infection”, is increasingly recognized as a determinant of treatment outcome for infections like tuberculosis. Complex infection arises through two mechanisms: within-host mutation (which results in clonal heterogeneity) and reinfection (which results in mixed infections). Estimates of the frequency of within-host mutation and reinfection in populations are critical for understanding the natural history of disease. These estimates influence projections of disease trends and effects of interventions. The genotyping technique MLVA (multiple loci variable-number tandem repeats analysis) can identify complex infections, but the current method to distinguish clonal heterogeneity from mixed infections is based on a rather simple rule. Here we describe ClassTR, a method which leverages MLVA information from isolates collected in a population to distinguish mixed infections from clonal heterogeneity. We formulate the resolution of complex infections into their constituent strains as an optimization problem, and show its NP-completeness. We solve it efficiently by using mixed integer linear programming and graph decomposition. Once the complex infections are resolved into their constituent strains, ClassTR probabilistically classifies isolates as clonally heterogeneous or mixed by using a model of tandem repeat evolution. We first compare ClassTR with the standard rule-based classification on 100 simulated datasets. ClassTR outperforms the standard method, improving classification accuracy from 48% to 80%. We then apply ClassTR to a sample of 436 strains collected from tuberculosis patients in a South African community, of which 92 had complex infections. We find that ClassTR assigns an alternate classification to 18 of the 92 complex infections, suggesting important differences in practice. By explicitly modeling tandem repeat evolution, ClassTR helps to improve our understanding of the mechanisms driving within-host diversity of pathogens like Mycobacterium tuberculosis. |
format | Online Article Text |
id | pubmed-4734664 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2016 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-4734664 2016-02-04 ClassTR: Classifying Within-Host Heterogeneity Based on Tandem Repeats with Application to Mycobacterium tuberculosis Infections Chindelevitch, Leonid Colijn, Caroline Moodley, Prashini Wilson, Douglas Cohen, Ted PLoS Comput Biol Research Article Public Library of Science 2016-02-01 /pmc/articles/PMC4734664/ /pubmed/26829497 http://dx.doi.org/10.1371/journal.pcbi.1004475 Text en © 2016 Chindelevitch et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. |
title | ClassTR: Classifying Within-Host Heterogeneity Based on Tandem Repeats with Application to Mycobacterium tuberculosis Infections |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4734664/ https://www.ncbi.nlm.nih.gov/pubmed/26829497 http://dx.doi.org/10.1371/journal.pcbi.1004475 |