Gene tree parsimony for incomplete gene trees: addressing true biological loss

MOTIVATION: Species tree estimation from gene trees can be complicated by gene duplication and loss, and “gene tree parsimony” (GTP) is one approach for estimating species trees from multiple gene trees. In its standard formulation, the objective is to find a species tree that minimizes the total nu...

Bibliographic Details
Main authors: Bayzid, Md Shamsuzzoha; Warnow, Tandy
Format: Online Article Text
Language: English
Published: BioMed Central 2018
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5774205/
https://www.ncbi.nlm.nih.gov/pubmed/29387142
http://dx.doi.org/10.1186/s13015-017-0120-1
_version_ 1783293721506217984
author Bayzid, Md Shamsuzzoha
Warnow, Tandy
author_facet Bayzid, Md Shamsuzzoha
Warnow, Tandy
author_sort Bayzid, Md Shamsuzzoha
collection PubMed
description MOTIVATION: Species tree estimation from gene trees can be complicated by gene duplication and loss, and “gene tree parsimony” (GTP) is one approach for estimating species trees from multiple gene trees. In its standard formulation, the objective is to find a species tree that minimizes the total number of gene duplications and losses with respect to the input set of gene trees. Although much is known about GTP, little is known about how to treat inputs containing some incomplete gene trees (i.e., gene trees lacking one or more of the species). RESULTS: We present new theory for GTP considering whether the incompleteness is due to gene birth and death (i.e., true biological loss) or taxon sampling, and present dynamic programming algorithms that can be used for an exact but exponential time solution for small numbers of taxa, or as a heuristic for larger numbers of taxa. We also prove that the “standard” calculations for duplications and losses exactly solve GTP when incompleteness results from taxon sampling, although they can be incorrect when incompleteness results from true biological loss. The software for the DP algorithm is freely available as open source code at https://github.com/smirarab/DynaDup.
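The duplication part of the GTP objective described above can be illustrated with a small sketch. It assumes the widely used LCA (least common ancestor) mapping between a rooted binary gene tree and a rooted binary species tree, and it counts duplications only, not losses. The nested-tuple tree encoding and the function names (`clusters`, `lca_cluster`, `count_duplications`) are hypothetical choices made for this illustration; this is not the DynaDup implementation.

```python
# Illustrative sketch only: trees are nested tuples, leaves are species names.
# Assumes rooted binary trees and that every gene-tree species occurs in the
# species tree. Counts duplications under the LCA mapping; losses are omitted.

def clusters(tree):
    """Return the leaf set (cluster) below every node of a rooted tree."""
    found = []

    def walk(node):
        if isinstance(node, str):
            leaves = frozenset([node])
        else:
            leaves = walk(node[0]) | walk(node[1])
        found.append(leaves)
        return leaves

    walk(tree)
    return found


def lca_cluster(species_clusters, leaves):
    """Smallest species-tree cluster containing `leaves` (the LCA mapping)."""
    return min((c for c in species_clusters if leaves <= c), key=len)


def count_duplications(gene_tree, species_tree):
    """Count gene-tree nodes that map to the same species node as a child."""
    species_clusters = clusters(species_tree)
    duplications = 0

    def walk(node):
        nonlocal duplications
        if isinstance(node, str):
            return frozenset([node])
        left = walk(node[0])
        right = walk(node[1])
        both = left | right
        mapped = lca_cluster(species_clusters, both)
        if mapped in (lca_cluster(species_clusters, left),
                      lca_cluster(species_clusters, right)):
            duplications += 1
        return both

    walk(gene_tree)
    return duplications


species = (("A", "B"), "C")
total = sum(
    count_duplications(g, species)
    for g in [(("A", "B"), "C"), (("A", "C"), "B"), ("A", "C")]
)
```

Here the third gene tree, `("A", "C")`, is incomplete because it lacks species B. A GTP search would evaluate such a total (plus a loss count) for each candidate species tree and keep the tree with the smallest value. How the missing species should affect the loss count is the question the abstract raises, and this sketch does not address it.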
format Online
Article
Text
id pubmed-5774205
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-5774205 2018-01-31 Gene tree parsimony for incomplete gene trees: addressing true biological loss Bayzid, Md Shamsuzzoha Warnow, Tandy Algorithms Mol Biol Research BioMed Central 2018-01-19 /pmc/articles/PMC5774205/ /pubmed/29387142 http://dx.doi.org/10.1186/s13015-017-0120-1 Text en © The Author(s) 2018 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle Research
Bayzid, Md Shamsuzzoha
Warnow, Tandy
Gene tree parsimony for incomplete gene trees: addressing true biological loss
title Gene tree parsimony for incomplete gene trees: addressing true biological loss
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5774205/
https://www.ncbi.nlm.nih.gov/pubmed/29387142
http://dx.doi.org/10.1186/s13015-017-0120-1