
Towards standardisation: comparison of five whole genome sequencing (WGS) analysis pipelines for detection of epidemiologically linked tuberculosis cases

BACKGROUND: Whole genome sequencing (WGS) is a reliable tool for studying tuberculosis (TB) transmission. WGS data are usually processed by custom-built analysis pipelines with little standardisation between them. AIM: To compare the impact of variability of several WGS analysis pipelines used inter...


Bibliographic Details
Main authors: Jajou, Rana; Kohl, Thomas A; Walker, Timothy; Norman, Anders; Cirillo, Daniela Maria; Tagliani, Elisa; Niemann, Stefan; de Neeling, Albert; Lillebaek, Troels; Anthony, Richard M; van Soolingen, Dick
Format: Online Article Text
Language: English
Published: European Centre for Disease Prevention and Control (ECDC) 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6918587/
https://www.ncbi.nlm.nih.gov/pubmed/31847944
http://dx.doi.org/10.2807/1560-7917.ES.2019.24.50.1900130
_version_ 1783480620145442816
author Jajou, Rana
Kohl, Thomas A
Walker, Timothy
Norman, Anders
Cirillo, Daniela Maria
Tagliani, Elisa
Niemann, Stefan
de Neeling, Albert
Lillebaek, Troels
Anthony, Richard M
van Soolingen, Dick
author_sort Jajou, Rana
collection PubMed
description BACKGROUND: Whole genome sequencing (WGS) is a reliable tool for studying tuberculosis (TB) transmission. WGS data are usually processed by custom-built analysis pipelines with little standardisation between them. AIM: To compare the impact of variability of several WGS analysis pipelines used internationally to detect epidemiologically linked TB cases. METHODS: From the Netherlands, 535 Mycobacterium tuberculosis complex (MTBC) strains from 2016 were included. Epidemiological information obtained from municipal health services was available for all mycobacterial interspersed repeat unit-variable number of tandem repeat (MIRU-VNTR) clustered cases. WGS data were analysed using five different pipelines: one core genome multilocus sequence typing (cgMLST) approach and four single nucleotide polymorphism (SNP)-based pipelines developed in Oxford, United Kingdom; Borstel, Germany; Bilthoven, the Netherlands; and Copenhagen, Denmark. WGS clusters were defined using a maximum pairwise distance of 12 SNPs/alleles. RESULTS: The cgMLST approach and the Oxford pipeline clustered all epidemiologically linked cases; in the other three SNP-based pipelines, however, one epidemiological link was missed due to insufficient coverage. In general, the genetic distances varied between pipelines, reflecting different clustering rates: the cgMLST approach clustered 92 cases, followed by 84, 83, 83 and 82 cases in the SNP-based pipelines from Copenhagen, Oxford, Borstel and Bilthoven, respectively. CONCLUSION: Concordance in ruling out epidemiological links was high between pipelines, which is an important step in the international validation of WGS data analysis. To increase accuracy in identifying TB transmission clusters, standardisation of crucial WGS criteria and creation of a reference database of representative MTBC sequences would be advisable.
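The description above defines WGS clusters by a maximum pairwise distance of 12 SNPs/alleles. As a rough illustration of how such a threshold turns a table of pairwise distances into clusters, here is a minimal Python sketch. It assumes single-linkage grouping (an isolate joins a cluster if it is within the threshold of at least one member), which the record does not confirm; the function name `cluster_isolates`, the isolate labels and the distances are hypothetical and are not data from the study.

```python
SNP_THRESHOLD = 12  # maximum pairwise distance stated in the description


def cluster_isolates(distances, threshold=SNP_THRESHOLD):
    """Group isolates whose pairwise distance is <= threshold.

    `distances` maps (isolate_a, isolate_b) tuples to a SNP/allele distance.
    Grouping is single linkage (an assumption, not taken from the record):
    two isolates share a cluster if a chain of pairs within the threshold
    connects them. Returns a sorted list of clusters with at least two members.
    """
    isolates = sorted({name for pair in distances for name in pair})
    parent = {name: name for name in isolates}

    def find(x):
        # Follow parent links to the cluster representative, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (a, b), d in distances.items():
        if d <= threshold:
            parent[find(a)] = find(b)  # merge the two clusters

    groups = {}
    for name in isolates:
        groups.setdefault(find(name), []).append(name)
    return sorted(g for g in groups.values() if len(g) > 1)


# Hypothetical distances between four made-up isolates.
example = {
    ("A", "B"): 3, ("B", "C"): 12, ("A", "C"): 15,
    ("A", "D"): 45, ("B", "D"): 50, ("C", "D"): 40,
}
print(cluster_isolates(example))
```

Under single linkage, isolates A and C end up in the same cluster even though they are 15 SNPs apart, because each is within 12 SNPs of B. A pipeline that instead required every pair in a cluster to be within the threshold would group them differently, which is one kind of criterion the description suggests standardising.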
format Online
Article
Text
id pubmed-6918587
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher European Centre for Disease Prevention and Control (ECDC)
record_format MEDLINE/PubMed
spelling pubmed-6918587 2020-01-06 Towards standardisation: comparison of five whole genome sequencing (WGS) analysis pipelines for detection of epidemiologically linked tuberculosis cases Jajou, Rana Kohl, Thomas A Walker, Timothy Norman, Anders Cirillo, Daniela Maria Tagliani, Elisa Niemann, Stefan de Neeling, Albert Lillebaek, Troels Anthony, Richard M van Soolingen, Dick Euro Surveill Research BACKGROUND: Whole genome sequencing (WGS) is a reliable tool for studying tuberculosis (TB) transmission. WGS data are usually processed by custom-built analysis pipelines with little standardisation between them. AIM: To compare the impact of variability of several WGS analysis pipelines used internationally to detect epidemiologically linked TB cases. METHODS: From the Netherlands, 535 Mycobacterium tuberculosis complex (MTBC) strains from 2016 were included. Epidemiological information obtained from municipal health services was available for all mycobacterial interspersed repeat unit-variable number of tandem repeat (MIRU-VNTR) clustered cases. WGS data were analysed using five different pipelines: one core genome multilocus sequence typing (cgMLST) approach and four single nucleotide polymorphism (SNP)-based pipelines developed in Oxford, United Kingdom; Borstel, Germany; Bilthoven, the Netherlands; and Copenhagen, Denmark. WGS clusters were defined using a maximum pairwise distance of 12 SNPs/alleles. RESULTS: The cgMLST approach and the Oxford pipeline clustered all epidemiologically linked cases; in the other three SNP-based pipelines, however, one epidemiological link was missed due to insufficient coverage. In general, the genetic distances varied between pipelines, reflecting different clustering rates: the cgMLST approach clustered 92 cases, followed by 84, 83, 83 and 82 cases in the SNP-based pipelines from Copenhagen, Oxford, Borstel and Bilthoven, respectively. CONCLUSION: Concordance in ruling out epidemiological links was high between pipelines, which is an important step in the international validation of WGS data analysis. To increase accuracy in identifying TB transmission clusters, standardisation of crucial WGS criteria and creation of a reference database of representative MTBC sequences would be advisable. European Centre for Disease Prevention and Control (ECDC) 2019-12-12 /pmc/articles/PMC6918587/ /pubmed/31847944 http://dx.doi.org/10.2807/1560-7917.ES.2019.24.50.1900130 Text en This article is copyright of the authors or their affiliated institutions, 2019. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution (CC BY 4.0) Licence. You may share and adapt the material, but must give appropriate credit to the source, provide a link to the licence, and indicate if changes were made.
title Towards standardisation: comparison of five whole genome sequencing (WGS) analysis pipelines for detection of epidemiologically linked tuberculosis cases
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6918587/
https://www.ncbi.nlm.nih.gov/pubmed/31847944
http://dx.doi.org/10.2807/1560-7917.ES.2019.24.50.1900130