EM for phylogenetic topology reconstruction on nonhomogeneous data
BACKGROUND: Reconstructing the phylogenetic tree topology of four taxa remains one of the main challenges in phylogenetics. The difficulties lie in considering evolutionary models that are not too restrictive and in dealing correctly with the long-branch attraction problem. The correct recon...
Main authors: | Ibáñez-Marcelo, Esther; Casanellas, Marta |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2014 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4074583/ https://www.ncbi.nlm.nih.gov/pubmed/24938507 http://dx.doi.org/10.1186/1471-2148-14-132 |
_version_ | 1782323227033337856 |
---|---|
author | Ibáñez-Marcelo, Esther Casanellas, Marta |
author_facet | Ibáñez-Marcelo, Esther Casanellas, Marta |
author_sort | Ibáñez-Marcelo, Esther |
collection | PubMed |
description | BACKGROUND: Reconstructing the phylogenetic tree topology of four taxa remains one of the main challenges in phylogenetics. The difficulties lie in considering evolutionary models that are not too restrictive and in dealing correctly with the long-branch attraction problem. The correct reconstruction of 4-taxon trees is crucial for making quartet-based methods work and for recovering large phylogenies. METHODS: We adapt the well-known expectation-maximization algorithm to evolutionary Markov models on phylogenetic 4-taxon trees. We then use this algorithm to estimate the substitution parameters, compute the corresponding likelihood, and infer the most likely quartet. RESULTS: In this paper we consider an expectation-maximization method for maximizing the likelihood of (time nonhomogeneous) evolutionary Markov models on trees. We study its success in reconstructing 4-taxon topologies and its performance as an input method for quartet-based phylogenetic reconstruction methods such as QFIT and QuartetSuite. Our results show that the method proposed here outperforms neighbor-joining and the usual (time-homogeneous continuous-time) maximum likelihood methods on 4-leaved trees with among-lineage instantaneous rate heterogeneity, and performs similarly to the usual continuous-time maximum likelihood when the data satisfy the assumptions of both methods. CONCLUSIONS: The method presented in this paper is well suited for reconstructing the topology of any number of taxa via quartet-based methods and is highly accurate, especially for highly divergent trees and time nonhomogeneous data. |
format | Online Article Text |
id | pubmed-4074583 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2014 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-4074583 2014-07-01 EM for phylogenetic topology reconstruction on nonhomogeneous data Ibáñez-Marcelo, Esther Casanellas, Marta BMC Evol Biol Methodology Article BACKGROUND: Reconstructing the phylogenetic tree topology of four taxa remains one of the main challenges in phylogenetics. The difficulties lie in considering evolutionary models that are not too restrictive and in dealing correctly with the long-branch attraction problem. The correct reconstruction of 4-taxon trees is crucial for making quartet-based methods work and for recovering large phylogenies. METHODS: We adapt the well-known expectation-maximization algorithm to evolutionary Markov models on phylogenetic 4-taxon trees. We then use this algorithm to estimate the substitution parameters, compute the corresponding likelihood, and infer the most likely quartet. RESULTS: In this paper we consider an expectation-maximization method for maximizing the likelihood of (time nonhomogeneous) evolutionary Markov models on trees. We study its success in reconstructing 4-taxon topologies and its performance as an input method for quartet-based phylogenetic reconstruction methods such as QFIT and QuartetSuite. Our results show that the method proposed here outperforms neighbor-joining and the usual (time-homogeneous continuous-time) maximum likelihood methods on 4-leaved trees with among-lineage instantaneous rate heterogeneity, and performs similarly to the usual continuous-time maximum likelihood when the data satisfy the assumptions of both methods. CONCLUSIONS: The method presented in this paper is well suited for reconstructing the topology of any number of taxa via quartet-based methods and is highly accurate, especially for highly divergent trees and time nonhomogeneous data. BioMed Central 2014-06-17 /pmc/articles/PMC4074583/ /pubmed/24938507 http://dx.doi.org/10.1186/1471-2148-14-132 Text en Copyright © 2014 Ibáñez-Marcelo and Casanellas; licensee BioMed Central Ltd. 
http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Methodology Article Ibáñez-Marcelo, Esther Casanellas, Marta EM for phylogenetic topology reconstruction on nonhomogeneous data |
title | EM for phylogenetic topology reconstruction on nonhomogeneous data |
title_full | EM for phylogenetic topology reconstruction on nonhomogeneous data |
title_fullStr | EM for phylogenetic topology reconstruction on nonhomogeneous data |
title_full_unstemmed | EM for phylogenetic topology reconstruction on nonhomogeneous data |
title_short | EM for phylogenetic topology reconstruction on nonhomogeneous data |
title_sort | em for phylogenetic topology reconstruction on nonhomogeneous data |
topic | Methodology Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4074583/ https://www.ncbi.nlm.nih.gov/pubmed/24938507 http://dx.doi.org/10.1186/1471-2148-14-132 |
work_keys_str_mv | AT ibanezmarceloesther emforphylogenetictopologyreconstructiononnonhomogeneousdata AT casanellasmarta emforphylogenetictopologyreconstructiononnonhomogeneousdata |