Comparing genomic variant identification protocols for Candida auris

Genomic analyses are widely applied to epidemiological, population genetic and experimental studies of pathogenic fungi. A wide range of methods are employed to carry out these analyses, typically without including controls that gauge the accuracy of variant prediction. The importance of tracking ou...

Bibliographic Details
Main Authors: Li, Xiao, Muñoz, José F., Gade, Lalitha, Argimon, Silvia, Bougnoux, Marie-Elisabeth, Bowers, Jolene R., Chow, Nancy A., Cuesta, Isabel, Farrer, Rhys A., Maufrais, Corinne, Monroy-Nieto, Juan, Pradhan, Dibyabhaba, Uehling, Jessie, Vu, Duong, Yeats, Corin A., Aanensen, David M., d’Enfert, Christophe, Engelthaler, David M., Eyre, David W., Fisher, Matthew C., Hagen, Ferry, Meyer, Wieland, Singh, Gagandeep, Alastruey-Izquierdo, Ana, Litvintseva, Anastasia P., Cuomo, Christina A.
Format: Online Article Text
Language: English
Published: Microbiology Society 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10210944/
https://www.ncbi.nlm.nih.gov/pubmed/37043380
http://dx.doi.org/10.1099/mgen.0.000979
author Li, Xiao
Muñoz, José F.
Gade, Lalitha
Argimon, Silvia
Bougnoux, Marie-Elisabeth
Bowers, Jolene R.
Chow, Nancy A.
Cuesta, Isabel
Farrer, Rhys A.
Maufrais, Corinne
Monroy-Nieto, Juan
Pradhan, Dibyabhaba
Uehling, Jessie
Vu, Duong
Yeats, Corin A.
Aanensen, David M.
d’Enfert, Christophe
Engelthaler, David M.
Eyre, David W.
Fisher, Matthew C.
Hagen, Ferry
Meyer, Wieland
Singh, Gagandeep
Alastruey-Izquierdo, Ana
Litvintseva, Anastasia P.
Cuomo, Christina A.
collection PubMed
description Genomic analyses are widely applied to epidemiological, population genetic and experimental studies of pathogenic fungi. A wide range of methods are employed to carry out these analyses, typically without including controls that gauge the accuracy of variant prediction. The importance of tracking outbreaks at a global scale has raised the urgency of establishing high-accuracy pipelines that generate consistent results between research groups. To evaluate currently employed methods for whole-genome variant detection and elaborate best practices for fungal pathogens, we compared how 14 independent variant calling pipelines performed across 35 Candida auris isolates from 4 distinct clades and evaluated the performance of variant calling, single-nucleotide polymorphism (SNP) counts and phylogenetic inference results. Although these pipelines used different variant callers and filtering criteria, we found high overall agreement of SNPs from each pipeline. This concordance correlated with site quality, as SNPs discovered by a few pipelines tended to show lower mapping quality scores and depth of coverage than those recovered by all pipelines. We observed that the major differences between pipelines were due to variation in read trimming strategies, SNP calling methods and parameters, and downstream filtration criteria. We calculated specificity and sensitivity for each pipeline by aligning three isolates with chromosomal level assemblies and found that the GATK-based pipelines were well balanced between these metrics. Selection of trimming methods had a greater impact on SAMtools-based pipelines than those using GATK. Phylogenetic trees inferred by each pipeline showed high consistency at the clade level, but there was more variability between isolates from a single outbreak, with pipelines that used more stringent cutoffs having lower resolution. This project generated two truth datasets useful for routine benchmarking of C. auris variant calling, a consensus VCF of genotypes discovered by 10 or more pipelines across these 35 diverse isolates and variants for 2 samples identified from whole-genome alignments. This study provides a foundation for evaluating SNP calling pipelines and developing best practices for future fungal genomic studies.
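The description mentions two computations that a small sketch may make more concrete: a consensus set of SNPs reported by a minimum number of pipelines, and per-pipeline sensitivity and specificity against a truth set. The Python below is only an illustration under stated assumptions, not the authors' code. The function names, the representation of a SNP as a (contig, position) tuple, the `callable_positions` set used to count true negatives, and the toy data are all hypothetical; the record does not say how the study defined these quantities in detail.

```python
from collections import Counter


def consensus_sites(calls_by_pipeline, min_pipelines=10):
    """Return the sites reported by at least `min_pipelines` pipelines.

    `calls_by_pipeline` maps a pipeline name to a set of (contig, position)
    tuples. This representation is an assumption made for the sketch.
    """
    counts = Counter()
    for sites in calls_by_pipeline.values():
        counts.update(set(sites))  # each pipeline votes once per site
    return {site for site, n in counts.items() if n >= min_pipelines}


def sensitivity_specificity(called, truth, callable_positions):
    """Compute (sensitivity, specificity) for one pipeline.

    Assumed definitions: sensitivity = TP / (TP + FN) and
    specificity = TN / (TN + FP), where negatives are the positions in
    `callable_positions` that are not variant in the truth set.
    """
    tp = len(called & truth)
    fn = len(truth - called)
    fp = len(called - truth)
    tn = len(callable_positions - called - truth)
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Toy data: three hypothetical pipelines and a threshold of 2 instead of 10.
calls = {
    "pipeline_a": {("chr1", 10), ("chr1", 20)},
    "pipeline_b": {("chr1", 10), ("chr1", 30)},
    "pipeline_c": {("chr1", 10), ("chr1", 20), ("chr1", 40)},
}
consensus = consensus_sites(calls, min_pipelines=2)

truth = {("chr1", 10), ("chr1", 20), ("chr1", 50)}
callable_positions = {("chr1", pos) for pos in range(1, 101)}
sens, spec = sensitivity_specificity(calls["pipeline_b"], truth, callable_positions)
print(sorted(consensus), sens, spec)
```

With real data, the site sets would typically be parsed from each pipeline's VCF output, and the threshold would be set to 10 to match the consensus rule stated in the description.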
format Online
Article
Text
id pubmed-10210944
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Microbiology Society
record_format MEDLINE/PubMed
spelling pubmed-10210944 2023-05-26 Comparing genomic variant identification protocols for Candida auris Microb Genom Research Articles Microbiology Society 2023-04-12 /pmc/articles/PMC10210944/ /pubmed/37043380 http://dx.doi.org/10.1099/mgen.0.000979 Text en © 2023 The Authors https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License.
title Comparing genomic variant identification protocols for Candida auris
topic Research Articles