Efficient inference of bacterial strain trees from genome-scale multilocus data

Motivation: In bacterial evolution, inferring a strain tree, which is the evolutionary history of different strains of the same bacterium, plays a major role in analyzing and understanding the evolution of strongly isolated populations, population divergence and various evolutionary events, such as...

Bibliographic Details
Main authors:	Than, C., Sugino, R., Innan, H., Nakhleh, L.
Format:	Text
Language:	English
Published:	Oxford University Press 2008
Subjects:	Ismb 2008 Conference Proceedings 19–23 July 2008, Toronto
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2718627/ https://www.ncbi.nlm.nih.gov/pubmed/18586704 http://dx.doi.org/10.1093/bioinformatics/btn149

author	Than, C. Sugino, R. Innan, H. Nakhleh, L.
collection	PubMed
description	Motivation: In bacterial evolution, inferring a strain tree, which is the evolutionary history of different strains of the same bacterium, plays a major role in analyzing and understanding the evolution of strongly isolated populations, population divergence and various evolutionary events, such as horizontal gene transfer and homologous recombination. Inferring a strain tree from multilocus data of these strains is exceptionally hard since, at this scale of evolution, processes such as homologous recombination result in a very high degree of gene tree incongruence. Results: In this article we present a novel computational method for inferring the strain tree despite massive gene tree incongruence caused by homologous recombination. Our method operates in three phases, where in phase I a set of candidate strain-tree topologies is computed using the maximal cliques concept, in phase II divergence times for each of the topologies are estimated using mixed integer linear programming (MILP) and in phase III the optimal tree (or trees) is selected based on an optimality criterion. We have analyzed 1898 genes from nine strains of the Staphylococcus aureus bacteria, and identified a fully resolved (binary) strain tree with estimated divergence times, despite the high degrees of sequence identity at the nucleotide level and gene tree incongruence. Our method's efficiency makes it particularly suitable for analysis of genome-scale datasets, including those of strongly isolated populations which are usually very challenging to analyze. Availability: We have implemented the algorithms in the PhyloNet software package, which is available publicly at http://bioinfo.cs.rice.edu/phylonet/ Contact: nakhleh@cs.rice.edu
format	Text
id	pubmed-2718627
institution	National Center for Biotechnology Information
language	English
publishDate	2008
publisher	Oxford University Press
record_format	MEDLINE/PubMed
spelling	pubmed-2718627 2009-07-31 Efficient inference of bacterial strain trees from genome-scale multilocus data Than, C. Sugino, R. Innan, H. Nakhleh, L. Bioinformatics Ismb 2008 Conference Proceedings 19–23 July 2008, Toronto Oxford University Press 2008-07-01 /pmc/articles/PMC2718627/ /pubmed/18586704 http://dx.doi.org/10.1093/bioinformatics/btn149 Text en © 2008 The Author(s) http://creativecommons.org/licenses/by-nc/2.0/uk/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.0/uk/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
title	Efficient inference of bacterial strain trees from genome-scale multilocus data
topic	Ismb 2008 Conference Proceedings 19–23 July 2008, Toronto
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2718627/ https://www.ncbi.nlm.nih.gov/pubmed/18586704 http://dx.doi.org/10.1093/bioinformatics/btn149

Similar items