SNPPar: identifying convergent evolution and other homoplasies from microbial whole-genome alignments

Homoplasic SNPs are considered important signatures of strong (positive) selective pressure, and hence of adaptive evolution for clinically relevant traits such as antibiotic resistance and virulence. Here we present a new tool, SNPPar, for efficient detection and analysis of homoplasic SNPs from la...

Bibliographic Details
Main authors: Edwards, David J., Duchene, Sebastián, Pope, Bernard, Holt, Kathryn E.
Format: Online Article Text
Language: English
Published: Microbiology Society 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8767352/
https://www.ncbi.nlm.nih.gov/pubmed/34874243
http://dx.doi.org/10.1099/mgen.0.000694
_version_ 1784634719374147584
author Edwards, David J.
Duchene, Sebastián
Pope, Bernard
Holt, Kathryn E.
collection PubMed
description Homoplasic SNPs are considered important signatures of strong (positive) selective pressure, and hence of adaptive evolution for clinically relevant traits such as antibiotic resistance and virulence. Here we present a new tool, SNPPar, for efficient detection and analysis of homoplasic SNPs from large whole genome sequencing datasets (>1000 isolates and/or >100 000 SNPs). SNPPar takes as input an SNP alignment, tree and annotated reference genome, and uses a combination of simple monophyly tests and ancestral state reconstruction (ASR, via TreeTime) to assign mutation events to branches and identify homoplasies. Mutations are annotated at the level of codon and gene, to facilitate analysis of convergent evolution. Testing on simulated data (120 Mycobacterium tuberculosis alignments representing local and global samples) showed SNPPar can detect homoplasic SNPs with very high specificity (zero false-positives in all tests) and high sensitivity (zero false-negatives in 89 % of tests). SNPPar analysis of three empirically sampled datasets (Elizabethkingia anophelis, Burkholderia dolosa and M. tuberculosis) produced results that were in concordance with previous studies, in terms of both individual homoplasies and evidence of convergence at the codon and gene levels. SNPPar analysis of a simulated alignment of ~64 000 genome-wide SNPs from 2000 M. tuberculosis genomes took ~23 min and ~2.6 GB of RAM to generate complete annotated results on a laptop. This analysis required ASR be conducted for only 1.25 % of SNPs, and the ASR step took ~23 s and 0.4 GB of RAM. SNPPar automates the detection and annotation of homoplasic SNPs efficiently and accurately from large SNP alignments. As demonstrated by the examples included here, this information can be readily used to explore the role of homoplasy in parallel and/or convergent evolution at the level of nucleotide, codon and/or gene.
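The monophyly pre-filter mentioned in the description can be illustrated with a small sketch. This is not SNPPar's code: the nested-tuple tree encoding, the function names `collect_clades` and `classify_snp`, and the three return labels are all assumptions made for illustration. The idea shown is only that a biallelic SNP whose carriers of one allele form a single clade can be explained by one mutation on the branch above that clade, so only the remaining SNPs would need to go on to ancestral state reconstruction.

```python
from typing import Dict, FrozenSet, Set, Tuple, Union

# Hypothetical tree encoding: a leaf is an isolate name, an internal node is a
# tuple of child subtrees. This is an illustrative format, not SNPPar's input.
Tree = Union[str, Tuple["Tree", ...]]


def collect_clades(tree: Tree, out: Set[FrozenSet[str]]) -> FrozenSet[str]:
    """Add the leaf set of every node in the tree to `out`; return this node's leaf set."""
    if isinstance(tree, str):
        leaves = frozenset([tree])
    else:
        leaves = frozenset().union(*(collect_clades(child, out) for child in tree))
    out.add(leaves)
    return leaves


def classify_snp(tree: Tree, alleles: Dict[str, str]) -> str:
    """Classify one SNP column given a mapping of isolate name to base.

    Returns "invariant", "single_branch" (one allele's carriers form a clade),
    or "needs_asr" (no simple monophyletic explanation; a candidate for ASR).
    """
    clade_sets: Set[FrozenSet[str]] = set()
    all_leaves = collect_clades(tree, clade_sets)
    if set(alleles) != all_leaves:
        raise ValueError("allele calls must cover exactly the isolates in the tree")

    # Group isolates by the base they carry at this position.
    groups: Dict[str, Set[str]] = {}
    for isolate, base in alleles.items():
        groups.setdefault(base, set()).add(isolate)

    if len(groups) < 2:
        return "invariant"
    # Simplification: only biallelic sites are tested for monophyly here.
    if len(groups) == 2 and any(frozenset(g) in clade_sets for g in groups.values()):
        return "single_branch"
    return "needs_asr"


tree = ((("A", "B"), "C"), ("D", "E"))
snp1 = {"A": "T", "B": "T", "C": "C", "D": "C", "E": "C"}  # carriers A,B form a clade
snp2 = {"A": "G", "B": "C", "C": "C", "D": "G", "E": "C"}  # carriers A,D do not
print(classify_snp(tree, snp1))
print(classify_snp(tree, snp2))
```

A cheap set-membership test of this kind is one plausible reason why, in the benchmark quoted above, ASR was needed for only a small fraction of SNPs; the sketch makes no claim about how SNPPar handles multi-allelic sites, missing calls or rooting.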
format Online
Article
Text
id pubmed-8767352
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Microbiology Society
record_format MEDLINE/PubMed
spelling pubmed-8767352 2022-01-19 SNPPar: identifying convergent evolution and other homoplasies from microbial whole-genome alignments Edwards, David J. Duchene, Sebastián Pope, Bernard Holt, Kathryn E. Microb Genom Research Articles Microbiology Society 2021-12-07 /pmc/articles/PMC8767352/ /pubmed/34874243 http://dx.doi.org/10.1099/mgen.0.000694 Text en © 2021 The Authors https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License. This article was made open access via a Publish and Read agreement between the Microbiology Society and the corresponding author’s institution.
title SNPPar: identifying convergent evolution and other homoplasies from microbial whole-genome alignments
topic Research Articles