Species Tree Estimation and the Impact of Gene Loss Following Whole-Genome Duplication
Whole-genome duplication (WGD) occurs broadly and repeatedly across the history of eukaryotes and is recognized as a prominent evolutionary force, especially in plants. Immediately following WGD, most genes are present in two copies as paralogs. Due to this redundancy, one copy of a paralog pair com...
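Main Authors: | Xiong, Haifeng, Wang, Danying, Shao, Chen, Yang, Xuchen, Yang, Jialin, Ma, Tao, Davis, Charles C, Liu, Liang, Xi, Zhenxiang |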
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2022 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9558847/ https://www.ncbi.nlm.nih.gov/pubmed/35689633 http://dx.doi.org/10.1093/sysbio/syac040 |
_version_ | 1784807533388496896 |
---|---|
author | Xiong, Haifeng Wang, Danying Shao, Chen Yang, Xuchen Yang, Jialin Ma, Tao Davis, Charles C Liu, Liang Xi, Zhenxiang |
author_sort | Xiong, Haifeng |
collection | PubMed |
description | Whole-genome duplication (WGD) occurs broadly and repeatedly across the history of eukaryotes and is recognized as a prominent evolutionary force, especially in plants. Immediately following WGD, most genes are present in two copies as paralogs. Due to this redundancy, one copy of a paralog pair commonly undergoes pseudogenization and is eventually lost. When speciation occurs shortly after WGD, however, differential loss of paralogs may lead to spurious phylogenetic inference resulting from the inclusion of pseudoorthologs (paralogous genes mistakenly identified as orthologs because they are present in single copies within each sampled species). The impact of including pseudoorthologs versus true orthologs as a result of gene extinction (or incomplete laboratory sampling) is only recently gaining empirical attention in the phylogenomics community. Moreover, few studies have yet investigated this phenomenon in an explicit coalescent framework. Here, using mathematical models, numerous simulated data sets, and two newly assembled empirical data sets, we assess the effect of pseudoorthologs on species tree estimation under varying degrees of incomplete lineage sorting (ILS) and differential gene loss scenarios following WGD. When gene loss occurs along the terminal branches of the species tree, alignment-based (BPP) and gene-tree-based (ASTRAL, MP-EST, and STAR) coalescent methods are adversely affected as the degree of ILS increases. Their performance can be greatly improved by sampling a sufficiently large number of genes. Under the same circumstances, however, concatenation methods consistently estimate incorrect species trees as the number of genes increases. Additionally, pseudoorthologs can greatly mislead species tree inference when gene loss occurs along the internal branches of the species tree. Here, both coalescent and concatenation methods yield inconsistent results. These results underscore the importance of understanding the influence of pseudoorthologs in the phylogenomics era. [Coalescent method; concatenation method; incomplete lineage sorting; pseudoorthologs; single-copy gene; whole-genome duplication.] |
format | Online Article Text |
id | pubmed-9558847 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-9558847 2022-10-18 Species Tree Estimation and the Impact of Gene Loss Following Whole-Genome Duplication Xiong, Haifeng Wang, Danying Shao, Chen Yang, Xuchen Yang, Jialin Ma, Tao Davis, Charles C Liu, Liang Xi, Zhenxiang Syst Biol Regular Articles Oxford University Press 2022-06-11 /pmc/articles/PMC9558847/ /pubmed/35689633 http://dx.doi.org/10.1093/sysbio/syac040 Text en © The Author(s) 2022. Published by Oxford University Press on behalf of the Society of Systematic Biologists. https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
title | Species Tree Estimation and the Impact of Gene Loss Following Whole-Genome Duplication |
topic | Regular Articles |