Species Tree Estimation and the Impact of Gene Loss Following Whole-Genome Duplication

Whole-genome duplication (WGD) occurs broadly and repeatedly across the history of eukaryotes and is recognized as a prominent evolutionary force, especially in plants. Immediately following WGD, most genes are present in two copies as paralogs. Due to this redundancy, one copy of a paralog pair com...

Bibliographic Details
Main authors: Xiong, Haifeng; Wang, Danying; Shao, Chen; Yang, Xuchen; Yang, Jialin; Ma, Tao; Davis, Charles C; Liu, Liang; Xi, Zhenxiang
Format: Online Article Text
Language: English
Published: Oxford University Press 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9558847/
https://www.ncbi.nlm.nih.gov/pubmed/35689633
http://dx.doi.org/10.1093/sysbio/syac040
author Xiong, Haifeng
Wang, Danying
Shao, Chen
Yang, Xuchen
Yang, Jialin
Ma, Tao
Davis, Charles C
Liu, Liang
Xi, Zhenxiang
author_sort Xiong, Haifeng
collection PubMed
description Whole-genome duplication (WGD) occurs broadly and repeatedly across the history of eukaryotes and is recognized as a prominent evolutionary force, especially in plants. Immediately following WGD, most genes are present in two copies as paralogs. Due to this redundancy, one copy of a paralog pair commonly undergoes pseudogenization and is eventually lost. When speciation occurs shortly after WGD, however, differential loss of paralogs may lead to spurious phylogenetic inference resulting from the inclusion of pseudoorthologs, that is, paralogous genes mistakenly identified as orthologs because they are present in single copies within each sampled species. The influence and impact of including pseudoorthologs versus true orthologs as a result of gene extinction (or incomplete laboratory sampling) have only recently gained empirical attention in the phylogenomics community. Moreover, few studies have yet investigated this phenomenon in an explicit coalescent framework. Here, using mathematical models, numerous simulated data sets, and two newly assembled empirical data sets, we assess the effect of pseudoorthologs on species tree estimation under varying degrees of incomplete lineage sorting (ILS) and differential gene loss scenarios following WGD. When gene loss occurs along the terminal branches of the species tree, alignment-based (BPP) and gene-tree-based (ASTRAL, MP-EST, and STAR) coalescent methods are adversely affected as the degree of ILS increases. Their performance can be greatly improved by sampling a sufficiently large number of genes. Under the same circumstances, however, concatenation methods consistently estimate incorrect species trees as the number of genes increases. Additionally, pseudoorthologs can greatly mislead species tree inference when gene loss occurs along the internal branches of the species tree. In this case, both coalescent and concatenation methods yield inconsistent results. These results underscore the importance of understanding the influence of pseudoorthologs in the phylogenomics era. [Coalescent method; concatenation method; incomplete lineage sorting; pseudoorthologs; single-copy gene; whole-genome duplication.]
format Online
Article
Text
id pubmed-9558847
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-9558847 2022-10-18 Species Tree Estimation and the Impact of Gene Loss Following Whole-Genome Duplication Xiong, Haifeng Wang, Danying Shao, Chen Yang, Xuchen Yang, Jialin Ma, Tao Davis, Charles C Liu, Liang Xi, Zhenxiang Syst Biol Regular Articles Oxford University Press 2022-06-11 /pmc/articles/PMC9558847/ /pubmed/35689633 http://dx.doi.org/10.1093/sysbio/syac040 Text en © The Author(s) 2022. Published by Oxford University Press on behalf of the Society of Systematic Biologists. https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title Species Tree Estimation and the Impact of Gene Loss Following Whole-Genome Duplication
topic Regular Articles