Maximum likelihood models and algorithms for gene tree evolution with duplications and losses

BACKGROUND: The abundance of new genomic data provides the opportunity to map the location of gene duplication and loss events on a species phylogeny. The first methods for mapping gene duplications and losses were based on a parsimony criterion, finding the mapping that minimizes the number of dupl...

Bibliographic Details
Main authors: Górecki, Pawel, Burleigh, Gordon J, Eulenstein, Oliver
Format: Text
Language: English
Published: BioMed Central 2011
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3044269/
https://www.ncbi.nlm.nih.gov/pubmed/21342544
http://dx.doi.org/10.1186/1471-2105-12-S1-S15
author Górecki, Pawel
Burleigh, Gordon J
Eulenstein, Oliver
collection PubMed
description BACKGROUND: The abundance of new genomic data provides the opportunity to map the location of gene duplication and loss events on a species phylogeny. The first methods for mapping gene duplications and losses were based on a parsimony criterion, finding the mapping that minimizes the number of duplication and loss events. Probabilistic modeling of gene duplication and loss is relatively new and has largely focused on birth-death processes. RESULTS: We introduce a new maximum likelihood model that estimates the speciation and gene duplication and loss events in a gene tree within a species tree with branch lengths. We also provide an algorithm, efficient in practice, that computes optimal evolutionary scenarios for this model. We implemented the algorithm in the program DrML and verified its performance with empirical and simulated data. CONCLUSIONS: In test data sets, DrML finds optimal gene duplication and loss scenarios within minutes, even when the gene trees contain sequences from several hundred species. In many cases, these optimal scenarios differ from the lca-mapping that results from a parsimony gene tree reconciliation. Thus, DrML provides a new, practical statistical framework on which to study gene duplication.
format Text
id pubmed-3044269
institution National Center for Biotechnology Information
language English
publishDate 2011
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3044269 2011-02-25 BMC Bioinformatics Research BioMed Central 2011-02-15 Text en Copyright ©2011 Górecki et al; licensee BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Maximum likelihood models and algorithms for gene tree evolution with duplications and losses
topic Research