SpeciesRax: A Tool for Maximum Likelihood Species Tree Inference from Gene Family Trees under Duplication, Transfer, and Loss
Species tree inference from gene family trees is becoming increasingly popular because it can account for discordance between the species tree and the corresponding gene family trees. In particular, methods that can account for multiple-copy gene families exhibit potential to leverage paralogy as in...
Main Authors: | Morel, Benoit; Schade, Paul; Lutteropp, Sarah; Williams, Tom A; Szöllősi, Gergely J; Stamatakis, Alexandros |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2022 |
Subjects: | Methods |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8826479/ https://www.ncbi.nlm.nih.gov/pubmed/35021210 http://dx.doi.org/10.1093/molbev/msab365 |
_version_ | 1784647441049452544 |
---|---|
author | Morel, Benoit Schade, Paul Lutteropp, Sarah Williams, Tom A Szöllősi, Gergely J Stamatakis, Alexandros |
author_facet | Morel, Benoit Schade, Paul Lutteropp, Sarah Williams, Tom A Szöllősi, Gergely J Stamatakis, Alexandros |
author_sort | Morel, Benoit |
collection | PubMed |
description | Species tree inference from gene family trees is becoming increasingly popular because it can account for discordance between the species tree and the corresponding gene family trees. In particular, methods that can account for multiple-copy gene families exhibit potential to leverage paralogy as informative signal. At present, there does not exist any widely adopted inference method for this purpose. Here, we present SpeciesRax, the first maximum likelihood method that can infer a rooted species tree from a set of gene family trees and can account for gene duplication, loss, and transfer events. By explicitly modeling events by which gene trees can depart from the species tree, SpeciesRax leverages the phylogenetic rooting signal in gene trees. SpeciesRax infers species tree branch lengths in units of expected substitutions per site and branch support values via paralogy-aware quartets extracted from the gene family trees. Using both empirical and simulated data sets we show that SpeciesRax is at least as accurate as the best competing methods while being one order of magnitude faster on large data sets at the same time. We used SpeciesRax to infer a biologically plausible rooted phylogeny of the vertebrates comprising 188 species from 31,612 gene families in 1 h using 40 cores. SpeciesRax is available under GNU GPL at https://github.com/BenoitMorel/GeneRax and on BioConda. |
format | Online Article Text |
id | pubmed-8826479 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-88264792022-02-09 SpeciesRax: A Tool for Maximum Likelihood Species Tree Inference from Gene Family Trees under Duplication, Transfer, and Loss Morel, Benoit Schade, Paul Lutteropp, Sarah Williams, Tom A Szöllősi, Gergely J Stamatakis, Alexandros Mol Biol Evol Methods Species tree inference from gene family trees is becoming increasingly popular because it can account for discordance between the species tree and the corresponding gene family trees. In particular, methods that can account for multiple-copy gene families exhibit potential to leverage paralogy as informative signal. At present, there does not exist any widely adopted inference method for this purpose. Here, we present SpeciesRax, the first maximum likelihood method that can infer a rooted species tree from a set of gene family trees and can account for gene duplication, loss, and transfer events. By explicitly modeling events by which gene trees can depart from the species tree, SpeciesRax leverages the phylogenetic rooting signal in gene trees. SpeciesRax infers species tree branch lengths in units of expected substitutions per site and branch support values via paralogy-aware quartets extracted from the gene family trees. Using both empirical and simulated data sets we show that SpeciesRax is at least as accurate as the best competing methods while being one order of magnitude faster on large data sets at the same time. We used SpeciesRax to infer a biologically plausible rooted phylogeny of the vertebrates comprising 188 species from 31,612 gene families in 1 h using 40 cores. SpeciesRax is available under GNU GPL at https://github.com/BenoitMorel/GeneRax and on BioConda. Oxford University Press 2022-01-11 /pmc/articles/PMC8826479/ /pubmed/35021210 http://dx.doi.org/10.1093/molbev/msab365 Text en © The Author(s) 2022. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. 
https://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (https://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com |
spellingShingle | Methods Morel, Benoit Schade, Paul Lutteropp, Sarah Williams, Tom A Szöllősi, Gergely J Stamatakis, Alexandros SpeciesRax: A Tool for Maximum Likelihood Species Tree Inference from Gene Family Trees under Duplication, Transfer, and Loss |
title | SpeciesRax: A Tool for Maximum Likelihood Species Tree Inference from Gene Family Trees under Duplication, Transfer, and Loss |
title_full | SpeciesRax: A Tool for Maximum Likelihood Species Tree Inference from Gene Family Trees under Duplication, Transfer, and Loss |
title_fullStr | SpeciesRax: A Tool for Maximum Likelihood Species Tree Inference from Gene Family Trees under Duplication, Transfer, and Loss |
title_full_unstemmed | SpeciesRax: A Tool for Maximum Likelihood Species Tree Inference from Gene Family Trees under Duplication, Transfer, and Loss |
title_short | SpeciesRax: A Tool for Maximum Likelihood Species Tree Inference from Gene Family Trees under Duplication, Transfer, and Loss |
title_sort | speciesrax: a tool for maximum likelihood species tree inference from gene family trees under duplication, transfer, and loss |
topic | Methods |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8826479/ https://www.ncbi.nlm.nih.gov/pubmed/35021210 http://dx.doi.org/10.1093/molbev/msab365 |
work_keys_str_mv | AT morelbenoit speciesraxatoolformaximumlikelihoodspeciestreeinferencefromgenefamilytreesunderduplicationtransferandloss AT schadepaul speciesraxatoolformaximumlikelihoodspeciestreeinferencefromgenefamilytreesunderduplicationtransferandloss AT lutteroppsarah speciesraxatoolformaximumlikelihoodspeciestreeinferencefromgenefamilytreesunderduplicationtransferandloss AT williamstoma speciesraxatoolformaximumlikelihoodspeciestreeinferencefromgenefamilytreesunderduplicationtransferandloss AT szollosigergelyj speciesraxatoolformaximumlikelihoodspeciestreeinferencefromgenefamilytreesunderduplicationtransferandloss AT stamatakisalexandros speciesraxatoolformaximumlikelihoodspeciestreeinferencefromgenefamilytreesunderduplicationtransferandloss |