A statistical method to identify recombination in bacterial genomes based on SNP incompatibility
BACKGROUND: Phylogeny estimation for bacteria is likely to reflect their true evolutionary histories only if they are highly clonal. However, recombination events could occur during evolution for some species. The reconstruction of phylogenetic trees from an alignment without considering recombinati...
Main authors: | Lai, Yi-Pin; Ioerger, Thomas R. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2018 |
Subjects: | Methodology Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6251179/ https://www.ncbi.nlm.nih.gov/pubmed/30466385 http://dx.doi.org/10.1186/s12859-018-2456-z |
_version_ | 1783373066188881920 |
---|---|
author | Lai, Yi-Pin Ioerger, Thomas R. |
author_facet | Lai, Yi-Pin Ioerger, Thomas R. |
author_sort | Lai, Yi-Pin |
collection | PubMed |
description | BACKGROUND: Phylogeny estimation for bacteria is likely to reflect their true evolutionary histories only if they are highly clonal. However, recombination events can occur during the evolution of some species. Reconstructing phylogenetic trees from an alignment without considering recombination can be misleading, since the relationships among strains in some parts of the genome might differ from those in others. Using a single, global tree can create the appearance of homoplasy in recombined regions. Hence, identifying recombination breakpoints is essential to better understand the evolutionary relationships among isolates in a bacterial population. RESULTS: We previously developed a method (called ACR) to detect potential breakpoints in an alignment by evaluating the compatibility of polymorphic sites in a sliding window. To assess the statistical significance of candidate breakpoints, we propose an extension of the algorithm (ptACR) that applies a permutation test to generate a null distribution for comparing the average local compatibility. The performance of ptACR is evaluated on both simulated and empirical datasets. ptACR shows similar sensitivity (true positive rate) but a lower false positive rate and a higher F1 score than basic ACR. When used to analyze a collection of clinical isolates of Staphylococcus aureus, ptACR finds clear evidence of recombination events in this bacterial pathogen and identifies statistically significant boundaries of chromosomal regions with distinct phylogenies. CONCLUSIONS: ptACR is an accurate and efficient method for identifying genomic regions affected by recombination in bacterial genomes. |
format | Online Article Text |
id | pubmed-6251179 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2018 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-6251179 2018-11-29 A statistical method to identify recombination in bacterial genomes based on SNP incompatibility Lai, Yi-Pin Ioerger, Thomas R. BMC Bioinformatics Methodology Article
BioMed Central 2018-11-22 /pmc/articles/PMC6251179/ /pubmed/30466385 http://dx.doi.org/10.1186/s12859-018-2456-z Text en © The Author(s) 2018 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Methodology Article Lai, Yi-Pin Ioerger, Thomas R. A statistical method to identify recombination in bacterial genomes based on SNP incompatibility |
title | A statistical method to identify recombination in bacterial genomes based on SNP incompatibility |
topic | Methodology Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6251179/ https://www.ncbi.nlm.nih.gov/pubmed/30466385 http://dx.doi.org/10.1186/s12859-018-2456-z |
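
The abstract above describes scoring the compatibility of polymorphic sites in a sliding window and using a permutation test to judge candidate breakpoints. The record gives no algorithmic detail, so the following is only a minimal sketch of that general idea, not the authors' ACR or ptACR implementation. It assumes biallelic SNPs encoded as strings of `0`/`1` (one character per strain), uses the classic four-gamete test as the compatibility criterion, and builds the null distribution by shuffling the order of sites. The function names, the window parameter `half_window`, and the toy data are all hypothetical.

```python
import random


def compatible(site_a, site_b):
    # Four-gamete test (assumes biallelic sites): two sites are compatible
    # with a single tree if fewer than 4 distinct allele pairs are observed.
    return len(set(zip(site_a, site_b))) < 4


def cross_compatibility(left, right):
    # Fraction of compatible pairs with one site on each side of a position.
    pairs = [(a, b) for a in left for b in right]
    return sum(compatible(a, b) for a, b in pairs) / len(pairs)


def window_statistic(sites, center, half_window):
    # Compare the half_window sites before `center` with those from `center` on.
    left = sites[max(0, center - half_window):center]
    right = sites[center:center + half_window]
    return cross_compatibility(left, right)


def permutation_p_value(sites, center, half_window, n_perm=1000, seed=0):
    # Null distribution (an assumption of this sketch): shuffle site order,
    # which destroys any spatial clustering of incompatible sites.
    rng = random.Random(seed)
    observed = window_statistic(sites, center, half_window)
    shuffled = list(sites)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        # Low compatibility is the signal, so count nulls at least as low.
        if window_statistic(shuffled, center, half_window) <= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Toy alignment of 4 strains: the first 10 SNPs support the split {1,2}|{3,4},
# the last 10 support {1,3}|{2,4}, so the true breakpoint is at index 10.
sites = ["0011"] * 10 + ["0101"] * 10
p_break = permutation_p_value(sites, center=10, half_window=4)
p_inside = permutation_p_value(sites, center=5, half_window=4)
print(f"p at breakpoint: {p_break:.3f}, p inside a block: {p_inside:.3f}")
```

In this toy case the window centered on index 10 has no compatible cross pairs, so few shuffled orderings score as low and the p-value is small, while a window lying entirely inside one block scores 1.0 and is never significant. A real analysis would also need multiple-testing control across candidate positions, which this sketch omits.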