
Robust barcoding and identification of Mycobacterium tuberculosis lineages for epidemiological and clinical studies

BACKGROUND: Tuberculosis, caused by bacteria in the Mycobacterium tuberculosis complex (MTBC), is a major global public health burden. Strain-specific genomic diversity in the known lineages of MTBC is an important factor in pathogenesis that may affect virulence, transmissibility, host response and...


Bibliographic Details
Main authors: Napier, Gary, Campino, Susana, Merid, Yared, Abebe, Markos, Woldeamanuel, Yimtubezinash, Aseffa, Abraham, Hibberd, Martin L., Phelan, Jody, Clark, Taane G.
Format: Online Article Text
Language: English
Published: BioMed Central 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7734807/
https://www.ncbi.nlm.nih.gov/pubmed/33317631
http://dx.doi.org/10.1186/s13073-020-00817-3
author Napier, Gary
Campino, Susana
Merid, Yared
Abebe, Markos
Woldeamanuel, Yimtubezinash
Aseffa, Abraham
Hibberd, Martin L.
Phelan, Jody
Clark, Taane G.
collection PubMed
description BACKGROUND: Tuberculosis, caused by bacteria in the Mycobacterium tuberculosis complex (MTBC), is a major global public health burden. Strain-specific genomic diversity in the known lineages of MTBC is an important factor in pathogenesis that may affect virulence, transmissibility, host response and emergence of drug resistance. Fast and accurate tracking of MTBC strains is therefore crucial for infection control, and our previous work developed a 62-single nucleotide polymorphism (SNP) barcode that reports the phylogenetic identity of 7 human lineages and 64 sub-lineages. METHODS: To update this barcode, we analysed whole genome sequencing data from 35,298 MTBC isolates (~1 million SNPs) covering 9 main lineages and 3 similar animal-related species (M. tuberculosis var. bovis, M. tuberculosis var. caprae and M. tuberculosis var. orygis). The data were partitioned into training (N = 17,903, 50.7%) and test (N = 17,395, 49.3%) sets and analysed using an integrated statistical approach that combines a phylogenetic tree with population differentiation (F_ST). RESULTS: By constructing a phylogenetic tree on the training MTBC isolates, we characterised 90 lineages, sub-lineages or species, of which 30 are new, and identified 421 robust barcoding mutations, from which a minimal set of 90 was selected that included 20 markers from the 62-SNP barcode. The barcoding SNPs (both the 90 and the 421) perfectly discriminated the 86 MTBC isolate (sub-)lineages in the test set and could accurately reconstruct the clades across the combined 35k samples. CONCLUSIONS: The validated 90 SNPs can be used for the rapid diagnosis and tracking of MTBC strains to assist public health surveillance and control. To facilitate this, the SNP markers have now been incorporated into the TB-Profiler informatics platform (https://github.com/jodyphelan/TBProfiler).
format Online
Article
Text
id pubmed-7734807
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-7734807 2020-12-15 Robust barcoding and identification of Mycobacterium tuberculosis lineages for epidemiological and clinical studies. Genome Med. Research. BioMed Central 2020-12-14 /pmc/articles/PMC7734807/ /pubmed/33317631 http://dx.doi.org/10.1186/s13073-020-00817-3 Text en © The Author(s) 2020. Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
title Robust barcoding and identification of Mycobacterium tuberculosis lineages for epidemiological and clinical studies
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7734807/
https://www.ncbi.nlm.nih.gov/pubmed/33317631
http://dx.doi.org/10.1186/s13073-020-00817-3
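The abstract in this record describes selecting barcoding SNPs by population differentiation (F_ST) and then using them to identify (sub-)lineages. The Python sketch below is only an illustration of that idea under stated assumptions: the record does not say which F_ST estimator the authors used, so a simple Wright-style F_ST = (H_T - H_S) / H_T for one biallelic site in haploid samples is assumed here, and the barcode table (`TOY_BARCODE`), its positions, alleles and lineage labels are invented for the example. It is not the TB-Profiler implementation.

```python
from typing import Dict, List

def fst_biallelic(clade_alleles: List[int], rest_alleles: List[int]) -> float:
    """Wright-style F_ST = (H_T - H_S) / H_T for one biallelic SNP.

    Alleles are coded 0 (reference) or 1 (alternate), one per haploid isolate.
    This estimator is an assumption; the source record does not specify one.
    """
    n1, n2 = len(clade_alleles), len(rest_alleles)
    p1 = sum(clade_alleles) / n1          # alt-allele frequency inside the clade
    p2 = sum(rest_alleles) / n2           # alt-allele frequency in all other isolates
    p_total = (sum(clade_alleles) + sum(rest_alleles)) / (n1 + n2)
    h_total = 2 * p_total * (1 - p_total)  # expected heterozygosity, pooled
    if h_total == 0:
        return 0.0                         # monomorphic site carries no signal
    # Size-weighted mean of the within-group heterozygosities.
    h_within = (n1 * 2 * p1 * (1 - p1) + n2 * 2 * p2 * (1 - p2)) / (n1 + n2)
    return (h_total - h_within) / h_total

# Hypothetical barcode: genome position -> (alternate allele, lineage label).
# Every position, allele and label here is made up for illustration.
TOY_BARCODE: Dict[int, tuple] = {
    1000: ("A", "lineage1"),
    2000: ("T", "lineage2"),
    2500: ("G", "lineage2.1"),
}

def call_lineages(sample_calls: Dict[int, str]) -> List[str]:
    """Return the labels whose barcode allele is present in the sample."""
    hits = [
        label
        for pos, (alt, label) in TOY_BARCODE.items()
        if sample_calls.get(pos) == alt
    ]
    return sorted(hits)

if __name__ == "__main__":
    # A SNP fixed in the clade and absent elsewhere is fully differentiated.
    print(fst_biallelic([1, 1, 1], [0, 0, 0, 0, 0]))
    # A sample carrying the lineage2 and lineage2.1 marker alleles.
    print(call_lineages({1000: "C", 2000: "T", 2500: "G"}))
```

In this toy setting, a SNP whose alternate allele is fixed in one clade and absent everywhere else reaches F_ST = 1, which is the property that makes it usable as a barcode marker; a nested sub-lineage is reported together with its parent because both markers match.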