Inference of reticulate evolutionary histories by maximum likelihood: the performance of information criteria

BACKGROUND: Maximum likelihood has been widely used for over three decades to infer phylogenetic trees from molecular data. When reticulate evolutionary events occur, several genomic regions may have conflicting evolutionary histories, and a phylogenetic network may provide a more adequate model for...

Bibliographic Details
Main Authors: Park, Hyun Jung, Nakhleh, Luay
Format: Online Article Text
Language: English
Published: BioMed Central 2012
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3526433/
https://www.ncbi.nlm.nih.gov/pubmed/23281614
http://dx.doi.org/10.1186/1471-2105-13-S19-S12
author Park, Hyun Jung
Nakhleh, Luay
collection PubMed
description BACKGROUND: Maximum likelihood has been widely used for over three decades to infer phylogenetic trees from molecular data. When reticulate evolutionary events occur, several genomic regions may have conflicting evolutionary histories, and a phylogenetic network may provide a more adequate model for representing the evolutionary history of the genomes or species. A maximum likelihood (ML) model has been proposed for this case and accounts for both mutation within a genomic region and reticulation across the regions. However, the performance of this model in terms of inferring information about reticulate evolution and properties that affect this performance have not been studied. RESULTS: In this paper, we study the effect of the evolutionary diameter and height of a reticulation event on its identifiability under ML. We find both of them, particularly the diameter, have a significant effect. Further, we find that the number of genes (which can be generalized to the concept of "non-recombining genomic regions") that are transferred across a reticulation edge affects its detectability. Last but not least, a fundamental challenge with phylogenetic networks is that they allow an arbitrary level of complexity, giving rise to the model selection problem. We investigate the performance of two information criteria, the Akaike Information Criterion (AIC) and the Bayesian Information Criterion (BIC), for addressing this problem. We find that BIC performs well in general for controlling the model complexity and preventing ML from grossly overestimating the number of reticulation events. CONCLUSION: Our results demonstrate that BIC provides a good framework for inferring reticulate evolutionary histories. Nevertheless, the results call for caution when interpreting the accuracy of the inference particularly for data sets with particular evolutionary features.
format Online
Article
Text
id pubmed-3526433
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3526433 2013-01-10 Inference of reticulate evolutionary histories by maximum likelihood: the performance of information criteria Park, Hyun Jung Nakhleh, Luay BMC Bioinformatics Proceedings BioMed Central 2012-12-19 /pmc/articles/PMC3526433/ /pubmed/23281614 http://dx.doi.org/10.1186/1471-2105-13-S19-S12 Text en Copyright ©2012 Park and Nakhleh; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Inference of reticulate evolutionary histories by maximum likelihood: the performance of information criteria
topic Proceedings