
GC content of plant genes is linked to past gene duplications

The frequency of G and C nucleotides in genomes varies from species to species, and sometimes even between different genes in the same genome. The monocot grasses have a bimodal distribution of genic GC content absent in dicots. We categorized plant genes from 5 dicots and 4 monocot grasses by synte...


Bibliographic Details
Main authors: Bowers, John E., Tang, Haibao, Burke, John M., Paterson, Andrew H.
Format: Online Article Text
Language: English
Published: Public Library of Science 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8758071/
https://www.ncbi.nlm.nih.gov/pubmed/35025913
http://dx.doi.org/10.1371/journal.pone.0261748
author Bowers, John E.
Tang, Haibao
Burke, John M.
Paterson, Andrew H.
collection PubMed
description The frequency of G and C nucleotides in genomes varies from species to species, and sometimes even between different genes in the same genome. The monocot grasses have a bimodal distribution of genic GC content that is absent in dicots. We categorized plant genes from 5 dicots and 4 monocot grasses by synteny to related species and determined that, in all 9 species, syntenic genes have significantly higher GC content than non-syntenic genes at the third codon position at their 5′ end. Lower GC content is correlated with gene duplication, as lack of synteny to distantly related genomes is associated with past interspersed gene duplications. Two mutation types can account for biased GC content: mutation of methylated C to T, and gene conversion from A to G. Gene conversion involves non-reciprocal exchanges between homologous alleles and is not detectable when the alleles are identical or heterozygous for presence-absence variation, both likely situations for genes duplicated to new loci. Gene duplication can cause production of siRNA, which can induce targeted methylation, elevating mC→T mutations. Recently duplicated plant genes are more frequently methylated and less likely to undergo gene conversion, and these two factors act synergistically to create a mutational environment favoring AT nucleotides. The syntenic genes with high GC content in the grasses compose a subset that has undergone few duplications, or for which duplicate copies were purged by selection. We propose a “biased gene duplication / biased mutation” (BDBM) model that may explain the origin and trajectory of the observed link between duplication and genic GC bias. The BDBM model is supported by empirical data based on joint analyses of 9 angiosperm species with their genes categorized by duplication status, GC content, methylation levels and functional classes.
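The quantity compared in the abstract, GC content at the third codon position near the 5′ end of a gene, can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name `gc3_content`, the `n_codons` window parameter, and the toy sequence are assumptions made for illustration, and the record does not state how large a 5′ window the study used.

```python
from typing import Optional


def gc3_content(cds: str, n_codons: Optional[int] = None) -> float:
    """Return the fraction of G or C bases at third codon positions.

    cds       -- in-frame coding sequence (hypothetical input, starts at codon 1)
    n_codons  -- if given, only the first n_codons codons (the 5' end) are used;
                 the window size is an illustrative assumption
    """
    seq = cds.upper()
    # Ignore any trailing bases that do not form a complete codon.
    usable = len(seq) - len(seq) % 3
    # Third codon positions are indices 2, 5, 8, ...
    third = seq[2:usable:3]
    if n_codons is not None:
        third = third[:n_codons]
    # Skip ambiguous bases such as N so they do not dilute the ratio.
    counted = [base for base in third if base in "ACGT"]
    if not counted:
        raise ValueError("no unambiguous third-position bases to count")
    return sum(base in "GC" for base in counted) / len(counted)


if __name__ == "__main__":
    toy_cds = "ATGGCCAAGTTT"  # codons: ATG GCC AAG TTT
    print(gc3_content(toy_cds))              # whole sequence
    print(gc3_content(toy_cds, n_codons=2))  # 5' window of two codons
```

Comparing this value between syntenic and non-syntenic gene sets would mirror the contrast the abstract describes, although the study's actual gene classification and statistics are not given in this record.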
format Online
Article
Text
id pubmed-8758071
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-8758071 2022-01-14 GC content of plant genes is linked to past gene duplications Bowers, John E. Tang, Haibao Burke, John M. Paterson, Andrew H. PLoS One Research Article Public Library of Science 2022-01-13 /pmc/articles/PMC8758071/ /pubmed/35025913 http://dx.doi.org/10.1371/journal.pone.0261748 Text en © 2022 Bowers et al https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title GC content of plant genes is linked to past gene duplications
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8758071/
https://www.ncbi.nlm.nih.gov/pubmed/35025913
http://dx.doi.org/10.1371/journal.pone.0261748