GC content of plant genes is linked to past gene duplications
The frequency of G and C nucleotides in genomes varies from species to species, and sometimes even between different genes in the same genome. The monocot grasses have a bimodal distribution of genic GC content absent in dicots. We categorized plant genes from 5 dicots and 4 monocot grasses by synte...
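
The full description in this record compares GC content at the third codon position (often called GC3) near the 5′ end of genes. As a rough illustration of what that quantity measures, here is a minimal Python sketch. The function names, the `n_codons` window parameter, and the toy sequence are assumptions made for this example; this is not the authors' analysis pipeline, and it assumes an in-frame coding sequence that starts at the first codon.

```python
from typing import Optional


def gc_content(seq: str) -> float:
    """Fraction of G or C bases over the whole sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return sum(base in "GC" for base in seq) / len(seq)


def gc3(cds: str, n_codons: Optional[int] = None) -> float:
    """Fraction of G or C at third codon positions.

    cds is assumed to be in frame, starting at the first codon.
    n_codons (hypothetical parameter) limits the calculation to the
    first n_codons codons, i.e. the 5' end of the coding sequence.
    """
    seq = cds.upper()
    usable = len(seq) - len(seq) % 3   # ignore a trailing partial codon
    thirds = seq[2:usable:3]           # every third base: positions 3, 6, 9, ...
    if n_codons is not None:
        thirds = thirds[:n_codons]
    if not thirds:
        raise ValueError("no complete codon in sequence")
    return sum(base in "GC" for base in thirds) / len(thirds)


if __name__ == "__main__":
    toy = "ATGGCCAAATGA"               # codons: ATG GCC AAA TGA (made-up example)
    print(gc_content(toy))             # overall GC fraction
    print(gc3(toy))                    # GC3 over all four codons
    print(gc3(toy, n_codons=2))        # GC3 over the first two codons only
```

Restricting to the first few codons is one simple way to look at the 5′ end separately; the window size to use is a choice the record does not specify.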
Main authors: | Bowers, John E.; Tang, Haibao; Burke, John M.; Paterson, Andrew H. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science 2022 |
Subjects: | Research Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8758071/ https://www.ncbi.nlm.nih.gov/pubmed/35025913 http://dx.doi.org/10.1371/journal.pone.0261748 |
_version_ | 1784632822794813440 |
---|---|
author | Bowers, John E. Tang, Haibao Burke, John M. Paterson, Andrew H. |
author_facet | Bowers, John E. Tang, Haibao Burke, John M. Paterson, Andrew H. |
author_sort | Bowers, John E. |
collection | PubMed |
description | The frequency of G and C nucleotides in genomes varies from species to species, and sometimes even between different genes in the same genome. The monocot grasses have a bimodal distribution of genic GC content absent in dicots. We categorized plant genes from 5 dicots and 4 monocot grasses by synteny to related species and determined that syntenic genes have significantly higher GC content than non-syntenic genes at their 5′-end in the third position within codons for all 9 species. Lower GC content is correlated with gene duplication, as lack of synteny to distantly related genomes is associated with past interspersed gene duplications. Two mutation types can account for biased GC content, mutation of methylated C to T and gene conversion from A to G. Gene conversion involves non-reciprocal exchanges between homologous alleles and is not detectable when the alleles are identical or heterozygous for presence-absence variation, both likely situations for genes duplicated to new loci. Gene duplication can cause production of siRNA which can induce targeted methylation, elevating mC→T mutations. Recently duplicated plant genes are more frequently methylated and less likely to undergo gene conversion, each of these factors synergistically creating a mutational environment favoring AT nucleotides. The syntenic genes with high GC content in the grasses compose a subset that have undergone few duplications, or for which duplicate copies were purged by selection. We propose a “biased gene duplication / biased mutation” (BDBM) model that may explain the origin and trajectory of the observed link between duplication and genic GC bias. The BDBM model is supported by empirical data based on joint analyses of 9 angiosperm species with their genes categorized by duplication status, GC content, methylation levels and functional classes. |
format | Online Article Text |
id | pubmed-8758071 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-8758071 2022-01-14 GC content of plant genes is linked to past gene duplications Bowers, John E. Tang, Haibao Burke, John M. Paterson, Andrew H. PLoS One Research Article The frequency of G and C nucleotides in genomes varies from species to species, and sometimes even between different genes in the same genome. The monocot grasses have a bimodal distribution of genic GC content absent in dicots. We categorized plant genes from 5 dicots and 4 monocot grasses by synteny to related species and determined that syntenic genes have significantly higher GC content than non-syntenic genes at their 5′-end in the third position within codons for all 9 species. Lower GC content is correlated with gene duplication, as lack of synteny to distantly related genomes is associated with past interspersed gene duplications. Two mutation types can account for biased GC content, mutation of methylated C to T and gene conversion from A to G. Gene conversion involves non-reciprocal exchanges between homologous alleles and is not detectable when the alleles are identical or heterozygous for presence-absence variation, both likely situations for genes duplicated to new loci. Gene duplication can cause production of siRNA which can induce targeted methylation, elevating mC→T mutations. Recently duplicated plant genes are more frequently methylated and less likely to undergo gene conversion, each of these factors synergistically creating a mutational environment favoring AT nucleotides. The syntenic genes with high GC content in the grasses compose a subset that have undergone few duplications, or for which duplicate copies were purged by selection. We propose a “biased gene duplication / biased mutation” (BDBM) model that may explain the origin and trajectory of the observed link between duplication and genic GC bias. The BDBM model is supported by empirical data based on joint analyses of 9 angiosperm species with their genes categorized by duplication status, GC content, methylation levels and functional classes. Public Library of Science 2022-01-13 /pmc/articles/PMC8758071/ /pubmed/35025913 http://dx.doi.org/10.1371/journal.pone.0261748 Text en © 2022 Bowers et al https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. |
spellingShingle | Research Article Bowers, John E. Tang, Haibao Burke, John M. Paterson, Andrew H. GC content of plant genes is linked to past gene duplications |
title | GC content of plant genes is linked to past gene duplications |
title_sort | gc content of plant genes is linked to past gene duplications |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8758071/ https://www.ncbi.nlm.nih.gov/pubmed/35025913 http://dx.doi.org/10.1371/journal.pone.0261748 |