Genomic variant-identification methods may alter Mycobacterium tuberculosis transmission inferences

Pathogen genomic data are increasingly used to characterize global and local transmission patterns of important human pathogens and to inform public health interventions. Yet, there is no current consensus on how to measure genomic variation. To test the effect of the variant-identification approach...

Bibliographic Details
Main authors: Walter, Katharine S., Colijn, Caroline, Cohen, Ted, Mathema, Barun, Liu, Qingyun, Bowers, Jolene, Engelthaler, David M., Narechania, Apurva, Lemmer, Darrin, Croda, Julio, Andrews, Jason R.
Format: Online Article Text
Language: English
Published: Microbiology Society 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7641424/
https://www.ncbi.nlm.nih.gov/pubmed/32735210
http://dx.doi.org/10.1099/mgen.0.000418
_version_ 1783605914936279040
author Walter, Katharine S.
Colijn, Caroline
Cohen, Ted
Mathema, Barun
Liu, Qingyun
Bowers, Jolene
Engelthaler, David M.
Narechania, Apurva
Lemmer, Darrin
Croda, Julio
Andrews, Jason R.
author_sort Walter, Katharine S.
collection PubMed
description Pathogen genomic data are increasingly used to characterize global and local transmission patterns of important human pathogens and to inform public health interventions. Yet, there is no current consensus on how to measure genomic variation. To test the effect of the variant-identification approach on transmission inferences for Mycobacterium tuberculosis, we conducted an experiment in which five genomic epidemiology groups applied variant-identification pipelines to the same outbreak sequence data. We compared the variants identified by each group in addition to transmission and phylogenetic inferences made with each variant set. To measure the performance of commonly used variant-identification tools, we simulated an outbreak. We compared the performance of three mapping algorithms, five variant callers and two variant filters in recovering true outbreak variants. Finally, we investigated the effect of applying increasingly stringent filters on transmission inferences and phylogenies. We found that variant-calling approaches used by different groups do not recover consistent sets of variants, which can lead to conflicting transmission inferences. Further, performance in recovering true variation varied widely across approaches. While no single variant-identification approach outperforms others in both recovering true genome-wide and outbreak-level variation, variant-identification algorithms calibrated upon real sequence data or that incorporate local reassembly outperform others in recovering true pairwise differences between isolates. The choice of variant filters contributed to extensive differences across pipelines, and applying increasingly stringent filters rapidly eroded the accuracy of transmission inferences and quality of phylogenies reconstructed from outbreak variation. Commonly used approaches to identify M. tuberculosis genomic variation have variable performance, particularly when predicting potential transmission links from pairwise genetic distances. Phylogenetic reconstruction may be improved by less stringent variant filtering. Approaches that improve variant identification in repetitive, hypervariable regions, such as long-read assemblies, may improve transmission inference.
format Online
Article
Text
id pubmed-7641424
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Microbiology Society
record_format MEDLINE/PubMed
spelling pubmed-7641424 2020-11-05 Genomic variant-identification methods may alter Mycobacterium tuberculosis transmission inferences Microb Genom Research Article Microbiology Society 2020-07-31 /pmc/articles/PMC7641424/ /pubmed/32735210 http://dx.doi.org/10.1099/mgen.0.000418 Text en © 2020 The Authors http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License.
title Genomic variant-identification methods may alter Mycobacterium tuberculosis transmission inferences
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7641424/
https://www.ncbi.nlm.nih.gov/pubmed/32735210
http://dx.doi.org/10.1099/mgen.0.000418