
EM for phylogenetic topology reconstruction on nonhomogeneous data

BACKGROUND: Reconstructing the phylogenetic tree topology of four taxa remains one of the main challenges in phylogenetics. The difficulties lie in using evolutionary models that are not too restrictive and in dealing correctly with the long-branch attraction problem. The correct recon...


Bibliographic Details
Main authors: Ibáñez-Marcelo, Esther, Casanellas, Marta
Format: Online Article Text
Language: English
Published: BioMed Central 2014
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4074583/
https://www.ncbi.nlm.nih.gov/pubmed/24938507
http://dx.doi.org/10.1186/1471-2148-14-132
author Ibáñez-Marcelo, Esther
Casanellas, Marta
collection PubMed
description BACKGROUND: Reconstructing the phylogenetic tree topology of four taxa remains one of the main challenges in phylogenetics. The difficulties lie in using evolutionary models that are not too restrictive and in dealing correctly with the long-branch attraction problem. The correct reconstruction of 4-taxon trees is crucial for making quartet-based methods work and for recovering large phylogenies. METHODS: We adapt the well-known expectation-maximization algorithm to evolutionary Markov models on phylogenetic 4-taxon trees. We then use this algorithm to estimate the substitution parameters, compute the corresponding likelihood, and infer the most likely quartet. RESULTS: In this paper we consider an expectation-maximization method for maximizing the likelihood of (time-nonhomogeneous) evolutionary Markov models on trees. We study its success in reconstructing 4-taxon topologies and its performance as an input method for quartet-based phylogenetic reconstruction methods such as QFIT and QuartetSuite. Our results show that the proposed method outperforms neighbor-joining and the usual (time-homogeneous, continuous-time) maximum-likelihood methods on 4-leaf trees with among-lineage instantaneous rate heterogeneity, and performs similarly to the usual continuous-time maximum likelihood when the data satisfy the assumptions of both methods. CONCLUSIONS: The method presented in this paper is well suited for reconstructing the topology of any number of taxa via quartet-based methods and is highly accurate, especially for highly divergent trees and time-nonhomogeneous data.
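The METHODS summary above can be illustrated with a minimal sketch. It is not the authors' implementation; it assumes a general Markov model on the quartet 12|34 with two hidden internal nodes u and v, a root distribution at u, and one 4x4 row-stochastic substitution matrix per edge. The function names (`random_stochastic`, `em_quartet`, `best_quartet`), the starting point, and the fixed iteration count are all hypothetical choices made for illustration.

```python
import math
import random

STATES = range(4)  # nucleotides A, C, G, T encoded as 0..3


def random_stochastic(rng, diag=0.7):
    """Row-stochastic 4x4 matrix with a dominant diagonal (illustrative start point)."""
    rows = []
    for i in STATES:
        row = [rng.random() * (1 - diag) / 4 + (diag if i == j else 0.0) for j in STATES]
        total = sum(row)
        rows.append([x / total for x in row])
    return rows


def normalize_rows(matrix):
    """Divide each row by its sum so that rows become probability vectors."""
    return [[x / sum(row) for x in row] for row in matrix]


def em_quartet(counts, n_iter=100, seed=0):
    """EM for the tree 12|34; counts maps observed patterns (a, b, c, d) to counts.

    Returns the log-likelihood recorded at the start of each iteration and the
    parameters (pi, m1, m2, muv, m3, m4) after the last update.
    """
    rng = random.Random(seed)
    pi = [0.25] * 4
    m1, m2, muv, m3, m4 = (random_stochastic(rng) for _ in range(5))
    history = []
    for _ in range(n_iter):
        e_pi = [0.0] * 4
        e1, e2, euv, e3, e4 = ([[0.0] * 4 for _ in STATES] for _ in range(5))
        loglik = 0.0
        for (a, b, c, d), n in counts.items():
            # E-step: joint probability of the pattern with each hidden pair (u, v).
            joint = {
                (u, v): pi[u] * m1[u][a] * m2[u][b] * muv[u][v] * m3[v][c] * m4[v][d]
                for u in STATES for v in STATES
            }
            marginal = sum(joint.values())
            loglik += n * math.log(marginal)
            for (u, v), p in joint.items():
                w = n * p / marginal  # expected number of sites with hidden states (u, v)
                e_pi[u] += w
                e1[u][a] += w
                e2[u][b] += w
                euv[u][v] += w
                e3[v][c] += w
                e4[v][d] += w
        history.append(loglik)
        # M-step: re-estimate every parameter from the expected counts.
        total = sum(e_pi)
        pi = [x / total for x in e_pi]
        m1, m2, muv, m3, m4 = (normalize_rows(e) for e in (e1, e2, euv, e3, e4))
    return history, (pi, m1, m2, muv, m3, m4)


def best_quartet(counts, **kwargs):
    """Run EM on the three quartet topologies and return the highest-scoring one."""
    leaf_orders = {"12|34": (0, 1, 2, 3), "13|24": (0, 2, 1, 3), "14|23": (0, 3, 1, 2)}
    scores = {}
    for name, order in leaf_orders.items():
        # Reorder the leaves so that the paired taxa occupy positions (0, 1) and (2, 3).
        permuted = {tuple(p[i] for i in order): n for p, n in counts.items()}
        scores[name] = em_quartet(permuted, **kwargs)[0][-1]
    return max(scores, key=scores.get), scores
```

A call such as `best_quartet(counts, n_iter=50)` returns the topology label with the largest recorded log-likelihood together with all three scores. EM can stop at a local maximum, so in practice several random starting points per topology would likely be needed; this sketch uses a single start.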
format Online
Article
Text
id pubmed-4074583
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4074583 2014-07-01 EM for phylogenetic topology reconstruction on nonhomogeneous data Ibáñez-Marcelo, Esther Casanellas, Marta BMC Evol Biol Methodology Article BioMed Central 2014-06-17 /pmc/articles/PMC4074583/ /pubmed/24938507 http://dx.doi.org/10.1186/1471-2148-14-132 Text en Copyright © 2014 Ibáñez-Marcelo and Casanellas; licensee BioMed Central Ltd.
http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License ( http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver ( http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title EM for phylogenetic topology reconstruction on nonhomogeneous data
topic Methodology Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4074583/
https://www.ncbi.nlm.nih.gov/pubmed/24938507
http://dx.doi.org/10.1186/1471-2148-14-132