
The impact of frequently neglected model violations on bacterial recombination rate estimation: a case study in Mycobacterium canettii and Mycobacterium tuberculosis


Bibliographic Details
Main authors: Sabin, Susanna; Morales-Arce, Ana Y; Pfeifer, Susanne P; Jensen, Jeffrey D
Format: Online Article Text
Language: English
Published: Oxford University Press 2022
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9073693/
https://www.ncbi.nlm.nih.gov/pubmed/35253851
http://dx.doi.org/10.1093/g3journal/jkac055
Description
Summary: Mycobacterium canettii is a causative agent of tuberculosis in humans, along with the members of the Mycobacterium tuberculosis complex. Frequently used as an outgroup to the M. tuberculosis complex in phylogenetic analyses, M. canettii is thought to offer the best proxy for the progenitor species that gave rise to the complex. Here, we leverage whole-genome sequencing data and biologically relevant population genomic models to compare the evolutionary dynamics driving variation in the recombining M. canettii with that in the nonrecombining M. tuberculosis complex, and discuss differences in observed genomic diversity in the light of expected levels of Hill–Robertson interference. In doing so, we highlight the methodological challenges of estimating recombination rates through traditional population genetic approaches using sequences called from populations of microorganisms and evaluate the likely mis-inference that arises owing to a neglect of common model violations including purifying selection, background selection, progeny skew, and population size change. In addition, we compare performance when full within-host polymorphism data are utilized, versus the more common approach of basing analyses on within-host consensus sequences.