Mycobacterium tuberculosis complex lineage 5 exhibits high levels of within-lineage genomic diversity and differing gene content compared to the type strain H37Rv

Bibliographic Details
Main Authors: Sanoussi, C. N'Dira, Coscolla, Mireia, Ofori-Anyinam, Boatema, Otchere, Isaac Darko, Antonio, Martin, Niemann, Stefan, Parkhill, Julian, Harris, Simon, Yeboah-Manu, Dorothy, Gagneux, Sebastien, Rigouts, Leen, Affolabi, Dissou, de Jong, Bouke C., Meehan, Conor J.
Format: Online Article Text
Language: English
Published: Microbiology Society 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8477398/
https://www.ncbi.nlm.nih.gov/pubmed/34241588
http://dx.doi.org/10.1099/mgen.0.000437
author Sanoussi, C. N'Dira
Coscolla, Mireia
Ofori-Anyinam, Boatema
Otchere, Isaac Darko
Antonio, Martin
Niemann, Stefan
Parkhill, Julian
Harris, Simon
Yeboah-Manu, Dorothy
Gagneux, Sebastien
Rigouts, Leen
Affolabi, Dissou
de Jong, Bouke C.
Meehan, Conor J.
collection PubMed
description Pathogens of the Mycobacterium tuberculosis complex (MTBC) are considered to be monomorphic, with little gene content variation between strains. Nevertheless, several genotypic and phenotypic factors separate strains of the different MTBC lineages (L), especially L5 and L6 (traditionally termed Mycobacterium africanum) strains, from each other. However, this genome variability and gene content, especially of L5 strains, has not been fully explored and may be important for pathobiology and current approaches for genomic analysis of MTBC strains, including transmission studies. By comparing the genomes of 355 L5 clinical strains (including 3 complete genomes and 352 Illumina whole-genome sequenced isolates) to each other and to H37Rv, we identified multiple genes that were differentially present or absent between H37Rv and L5 strains. Additionally, considerable gene content variability was found across L5 strains, including a split in the L5.3 sub-lineage into L5.3.1 and L5.3.2. These gene content differences had a small knock-on effect on transmission cluster estimation, with clustering rates influenced by the selected reference genome, and with potential overestimation of recent transmission when using H37Rv as the reference genome. We conclude that full capture of the gene diversity, especially high-resolution outbreak analysis, requires a variation of the single H37Rv-centric reference genome mapping approach currently used in most whole-genome sequencing data analysis pipelines. Moreover, the high within-lineage gene content variability suggests that the pan-genome of M. tuberculosis is at least several kilobases larger than previously thought, implying that a concatenated or reference-free genome assembly (de novo) approach may be needed for particular questions.
format Online
Article
Text
id pubmed-8477398
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Microbiology Society
record_format MEDLINE/PubMed
spelling pubmed-8477398 2021-09-28 Microb Genom Research Articles Microbiology Society 2021-07-09 /pmc/articles/PMC8477398/ /pubmed/34241588 http://dx.doi.org/10.1099/mgen.0.000437 Text en © 2021 The Authors https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License.
title Mycobacterium tuberculosis complex lineage 5 exhibits high levels of within-lineage genomic diversity and differing gene content compared to the type strain H37Rv
topic Research Articles
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8477398/
https://www.ncbi.nlm.nih.gov/pubmed/34241588
http://dx.doi.org/10.1099/mgen.0.000437