
Integrated synteny- and similarity-based inference on the polyploidization–fractionation cycle

Bibliographic Details
Main authors: Zhang, Yue, Yu, Zhe, Zheng, Chunfang, Sankoff, David
Format: Online Article Text
Language: English
Published: The Royal Society 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8193467/
https://www.ncbi.nlm.nih.gov/pubmed/34123351
http://dx.doi.org/10.1098/rsfs.2020.0059
author Zhang, Yue
Yu, Zhe
Zheng, Chunfang
Sankoff, David
collection PubMed
description Whole-genome doubling, tripling or replicating to a greater degree, due to fixation of polyploidization events, is attested in almost all lineages of the flowering plants, recurring in the ancestry of some plants two, three or more times in retracing their history to the earliest angiosperm. This major mechanism in plant genome evolution, which generally appears as instantaneous on the evolutionary time scale, sets in operation a compensatory process called fractionation, the loss of duplicate genes, initially rapid, but continuing at a diminishing rate over millions and tens of millions of years. We study this process by statistically comparing the distribution of duplicate gene pairs as a function of their time of creation through polyploidization, as measured by sequence similarity. The stochastic model that accounts for this distribution, though exceedingly simple, still has too many parameters to be estimated based only on the similarity distribution, while the computational procedures for compiling the distribution from annotated genomic data are heavily biased against earlier polyploidization events (syntenic ‘crumble’). Other parameters, such as the size of the initial gene complement and the ploidy of the various events giving rise to duplicate gene pairs, are even more inaccessible to estimation. Here, we show how the frequency of unpaired genes, identified via their embedding in stretches of duplicate pairs, together with previously established constraints among some parameters, adds enormously to the range of successive polyploidization events that can be analysed. This also allows us to estimate the initial gene complement and to correct for the bias due to crumble. We explore the applicability of our methodology to four flowering plant genomes covering a range of different polyploidization histories.
format Online
Article
Text
id pubmed-8193467
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher The Royal Society
record_format MEDLINE/PubMed
spelling pubmed-8193467 2022-02-02 Integrated synteny- and similarity-based inference on the polyploidization–fractionation cycle Zhang, Yue Yu, Zhe Zheng, Chunfang Sankoff, David Interface Focus Articles The Royal Society 2021-06-11 /pmc/articles/PMC8193467/ /pubmed/34123351 http://dx.doi.org/10.1098/rsfs.2020.0059 Text en © 2021 The Authors. Published by the Royal Society under the terms of the Creative Commons Attribution License http://creativecommons.org/licenses/by/4.0/, which permits unrestricted use, provided the original author and source are credited.
title Integrated synteny- and similarity-based inference on the polyploidization–fractionation cycle
topic Articles