
A coarse-graining, ultrametric approach to resolve the phylogeny of prokaryotic strains with frequent homologous recombination


Bibliographic Details
Main author: Pang, Tin Yau
Format: Online Article Text
Language: English
Published: BioMed Central 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7204016/
https://www.ncbi.nlm.nih.gov/pubmed/32381044
http://dx.doi.org/10.1186/s12862-020-01616-5
author Pang, Tin Yau
collection PubMed
description BACKGROUND: A frequent event in the evolution of prokaryotic genomes is homologous recombination, in which a foreign stretch of DNA replaces a genomic region of similar sequence. Recombination can affect the relative position of two genomes in a phylogenetic reconstruction in two different ways: (i) one genome can recombine with a DNA stretch that is similar to the other genome, thereby reducing their pairwise sequence divergence; (ii) one genome can recombine with a DNA stretch from an outgroup genome, increasing the pairwise divergence. While several recombination-aware phylogenetic algorithms exist, many of these cannot account for both types of recombination; some algorithms can, but do so inefficiently. Moreover, many of them reconstruct the ancestral recombination graph (ARG) to help infer the genome tree, and require that a substantial portion of each genome has not been affected by recombination, a sometimes unrealistic assumption. METHODS: Here, we propose a Coarse-Graining approach for Phylogenetic reconstruction (CGP), which is recombination-aware but forgoes ARG reconstruction. It accounts for the tendency of the effective recombination rate to be higher between genomes separated by a smaller phylogenetic distance. It is applicable even if all genomic regions have experienced substantial amounts of recombination, and can be used on both nucleotide and amino acid sequences. CGP considers the local density of substitutions along pairwise genome alignments, fitting a model to the empirical distribution of substitution density to infer the pairwise coalescent time. Given all pairwise coalescent times, CGP reconstructs an ultrametric tree representing vertical inheritance. RESULTS: Based on simulations, we show that the proposed approach can reconstruct ultrametric trees with accurate topology, branch lengths, and root positioning. Applied to a set of E. coli strains, the reconstructed trees are most consistent with gene distributions when inferred from amino acid sequences, a data type that cannot be utilized by many alternative approaches. CONCLUSIONS: The CGP algorithm is more accurate than alternative recombination-aware methods for ultrametric phylogenetic reconstructions.
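The two data-handling steps named in the METHODS part of the abstract can be illustrated with a minimal Python sketch. It is not the published CGP implementation: the model fit that turns substitution densities into coalescent times is omitted entirely, and the tree step uses plain average-linkage clustering, which is an assumption swapped in here because it yields an ultrametric tree from pairwise times. The function names, the window size, and the "#n" internal node labels are all hypothetical.

```python
from itertools import combinations


def window_substitution_counts(seq_a, seq_b, window=1000):
    """Count mismatches in consecutive, non-overlapping windows of a
    pairwise alignment. Columns with a gap ('-') in either sequence are
    ignored. A trailing partial window is dropped."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    counts = []
    for start in range(0, len(seq_a) - window + 1, window):
        a = seq_a[start:start + window]
        b = seq_b[start:start + window]
        counts.append(
            sum(1 for x, y in zip(a, b) if x != y and x != "-" and y != "-")
        )
    return counts


def ultrametric_tree_from_times(names, times):
    """Average-linkage clustering on pairwise coalescent times.

    `times` maps 2-tuples of names to a time; the height of each internal
    node is the average pairwise time between its two child clusters.
    Returns (newick_string, root_height). Assumes no leaf is named '#n'.
    """
    d = {frozenset(pair): t for pair, t in times.items()}
    # cluster id -> (newick text, number of leaves, node height)
    clusters = {name: (name, 1, 0.0) for name in names}
    next_id = 0
    while len(clusters) > 1:
        # Pick the pair of clusters with the smallest average time.
        a, b = min(combinations(clusters, 2), key=lambda p: d[frozenset(p)])
        height = d[frozenset((a, b))]
        nw_a, n_a, h_a = clusters.pop(a)
        nw_b, n_b, h_b = clusters.pop(b)
        new = f"#{next_id}"
        next_id += 1
        # Size-weighted average keeps d equal to the mean leaf-to-leaf time.
        for c in clusters:
            d[frozenset((new, c))] = (
                n_a * d[frozenset((a, c))] + n_b * d[frozenset((b, c))]
            ) / (n_a + n_b)
        # Branch length = parent height minus child height (ultrametric).
        newick = f"({nw_a}:{height - h_a:g},{nw_b}:{height - h_b:g})"
        clusters[new] = (newick, n_a + n_b, height)
    (newick, _, root_height), = clusters.values()
    return newick + ";", root_height


if __name__ == "__main__":
    print(window_substitution_counts("ACGTACGT", "ACGAAC-T", window=4))
    toy_times = {("A", "B"): 1.0, ("A", "C"): 3.0, ("B", "C"): 3.0}
    print(ultrametric_tree_from_times(["A", "B", "C"], toy_times))
```

In this sketch every leaf ends up at the same distance from the root, which is the defining property of an ultrametric tree. The per-window counts would serve only as the empirical input to a model such as the one the abstract describes.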
format Online
Article
Text
id pubmed-7204016
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-7204016 2020-05-12 A coarse-graining, ultrametric approach to resolve the phylogeny of prokaryotic strains with frequent homologous recombination. Pang, Tin Yau. BMC Evol Biol. Research Article. BioMed Central 2020-05-07 /pmc/articles/PMC7204016/ /pubmed/32381044 http://dx.doi.org/10.1186/s12862-020-01616-5 Text en © The Author(s) 2020. Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
title A coarse-graining, ultrametric approach to resolve the phylogeny of prokaryotic strains with frequent homologous recombination
topic Research Article