Towards standardisation: comparison of five whole genome sequencing (WGS) analysis pipelines for detection of epidemiologically linked tuberculosis cases
BACKGROUND: Whole genome sequencing (WGS) is a reliable tool for studying tuberculosis (TB) transmission. WGS data are usually processed by custom-built analysis pipelines with little standardisation between them. AIM: To compare the impact of variability of several WGS analysis pipelines used inter...
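Main authors: | Jajou, Rana; Kohl, Thomas A; Walker, Timothy; Norman, Anders; Cirillo, Daniela Maria; Tagliani, Elisa; Niemann, Stefan; de Neeling, Albert; Lillebaek, Troels; Anthony, Richard M; van Soolingen, Dick |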
---|---|
Format: | Online Article Text |
Language: | English |
Published: | European Centre for Disease Prevention and Control (ECDC) 2019 |
Subjects: | Research |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6918587/ https://www.ncbi.nlm.nih.gov/pubmed/31847944 http://dx.doi.org/10.2807/1560-7917.ES.2019.24.50.1900130 |
_version_ | 1783480620145442816 |
---|---|
author | Jajou, Rana; Kohl, Thomas A; Walker, Timothy; Norman, Anders; Cirillo, Daniela Maria; Tagliani, Elisa; Niemann, Stefan; de Neeling, Albert; Lillebaek, Troels; Anthony, Richard M; van Soolingen, Dick |
author_sort | Jajou, Rana |
collection | PubMed |
description | BACKGROUND: Whole genome sequencing (WGS) is a reliable tool for studying tuberculosis (TB) transmission. WGS data are usually processed by custom-built analysis pipelines with little standardisation between them. AIM: To compare the impact of variability of several WGS analysis pipelines used internationally to detect epidemiologically linked TB cases. METHODS: From the Netherlands, 535 Mycobacterium tuberculosis complex (MTBC) strains from 2016 were included. Epidemiological information obtained from municipal health services was available for all mycobacterial interspersed repeat unit-variable number of tandem repeat (MIRU-VNTR) clustered cases. WGS data were analysed using five different pipelines: one core genome multilocus sequence typing (cgMLST) approach and four single nucleotide polymorphism (SNP)-based pipelines developed in Oxford, United Kingdom; Borstel, Germany; Bilthoven, the Netherlands; and Copenhagen, Denmark. WGS clusters were defined using a maximum pairwise distance of 12 SNPs/alleles. RESULTS: The cgMLST approach and Oxford pipeline clustered all epidemiologically linked cases; however, in the other three SNP-based pipelines, one epidemiological link was missed due to insufficient coverage. In general, the genetic distances varied between pipelines, reflecting different clustering rates: the cgMLST approach clustered 92 cases, followed by 84, 83, 83 and 82 cases in the SNP-based pipelines from Copenhagen, Oxford, Borstel and Bilthoven, respectively. CONCLUSION: Concordance in ruling out epidemiological links was high between pipelines, which is an important step in the international validation of WGS data analysis. To increase accuracy in identifying TB transmission clusters, standardisation of crucial WGS criteria and creation of a reference database of representative MTBC sequences would be advisable. |
format | Online Article Text |
id | pubmed-6918587 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2019 |
publisher | European Centre for Disease Prevention and Control (ECDC) |
record_format | MEDLINE/PubMed |
spelling | pubmed-6918587 2020-01-06 Towards standardisation: comparison of five whole genome sequencing (WGS) analysis pipelines for detection of epidemiologically linked tuberculosis cases Euro Surveill Research European Centre for Disease Prevention and Control (ECDC) 2019-12-12 /pmc/articles/PMC6918587/ /pubmed/31847944 http://dx.doi.org/10.2807/1560-7917.ES.2019.24.50.1900130 Text en This article is copyright of the authors or their affiliated institutions, 2019. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution (CC BY 4.0) Licence. You may share and adapt the material, but must give appropriate credit to the source, provide a link to the licence, and indicate if changes were made. |
title | Towards standardisation: comparison of five whole genome sequencing (WGS) analysis pipelines for detection of epidemiologically linked tuberculosis cases |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6918587/ https://www.ncbi.nlm.nih.gov/pubmed/31847944 http://dx.doi.org/10.2807/1560-7917.ES.2019.24.50.1900130 |