
Consistent Clustering Pattern of Prokaryotic Genes Based on Base Frequency at the Second Codon Position and its Association with Functional Category Preference

In 2002, our research group observed a gene clustering pattern based on the base frequency of A versus T at the second codon position in the genome of Vibrio cholerae and found that the functional category distribution of genes in the two clusters was different. With the availability of a large numbe...
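The quantity behind the clustering, the relative frequency of A versus T at the second codon position of each gene, can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name `second_position_frequencies`, the toy sequence, and the choice to ignore a trailing incomplete codon are all assumptions made for illustration.

```python
from collections import Counter


def second_position_frequencies(cds: str) -> dict[str, float]:
    """Relative frequency of each base at the second codon position.

    Hypothetical helper: `cds` is assumed to be an in-frame coding
    sequence written with the letters A, C, G, T.
    """
    seq = cds.upper()
    # Ignore a trailing incomplete codon (an assumption of this sketch).
    usable = len(seq) - len(seq) % 3
    # Index 1 of every codon is the second codon position.
    second_bases = seq[1:usable:3]
    total = len(second_bases)
    if total == 0:
        raise ValueError("sequence is shorter than one codon")
    counts = Counter(second_bases)
    return {base: counts.get(base, 0) / total for base in "ACGT"}


# Toy example: codons ATG, AAA, TTT, GCA have second bases T, A, T, C.
freqs = second_position_frequencies("ATGAAATTTGCA")
a2, t2 = freqs["A"], freqs["T"]
```

Each gene then yields one (A2, T2) pair, and the set of pairs for a genome is what a clustering step would operate on; which clustering algorithm and which criterion for the optimal number of clusters the study used is not stated in this record.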


Bibliographic Details
Main Authors: Jin, Yan-Ting; Ma, Cong; Wang, Xin; Wang, Shu-Xuan; Zhang, Kai-Yue; Zheng, Wen-Xin; Deng, Zixin; Wang, Ju; Guo, Feng-Biao
Format: Online Article Text
Language: English
Published: Springer Nature Singapore 2021
Subjects: Original Research Article
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9124167/
https://www.ncbi.nlm.nih.gov/pubmed/34817803
http://dx.doi.org/10.1007/s12539-021-00493-w
_version_ 1784711688041267200
author Jin, Yan-Ting
Ma, Cong
Wang, Xin
Wang, Shu-Xuan
Zhang, Kai-Yue
Zheng, Wen-Xin
Deng, Zixin
Wang, Ju
Guo, Feng-Biao
author_sort Jin, Yan-Ting
collection PubMed
description In 2002, our research group observed a gene clustering pattern based on the base frequency of A versus T at the second codon position in the genome of Vibrio cholerae and found that the functional category distribution of genes in the two clusters was different. With the availability of a large number of sequenced genomes, we performed a systematic investigation of the A(2)–T(2) distribution and found that 2694 out of 2764 prokaryotic genomes have an optimal clustering number of two, indicating a consistent pattern. Analysis of the functional categories of the coding genes in each cluster in 1483 prokaryotic genomes indicated that 99.33% of the genomes exhibited a significant difference (p < 0.01) in function distribution between the two clusters. Specifically, functional category P was overrepresented in the small cluster of 98.65% of genomes, whereas categories J, K, and L were overrepresented in the larger cluster of over 98.52% of genomes. Lineage analysis showed that these preferences appear consistently across all phyla. Overall, our work revealed an almost universal clustering pattern based on the relative frequency of A(2) versus T(2) and its role in functional category preference. These findings will promote the understanding of the rationality of theoretical prediction of functional classes of genes from their nucleotide sequences and how protein function is determined by DNA sequence. GRAPHICAL ABSTRACT: [Image: see text] SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1007/s12539-021-00493-w.
format Online
Article
Text
id pubmed-9124167
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Springer Nature Singapore
record_format MEDLINE/PubMed
spelling pubmed-9124167 2022-05-23 Consistent Clustering Pattern of Prokaryotic Genes Based on Base Frequency at the Second Codon Position and its Association with Functional Category Preference Jin, Yan-Ting Ma, Cong Wang, Xin Wang, Shu-Xuan Zhang, Kai-Yue Zheng, Wen-Xin Deng, Zixin Wang, Ju Guo, Feng-Biao Interdiscip Sci Original Research Article In 2002, our research group observed a gene clustering pattern based on the base frequency of A versus T at the second codon position in the genome of Vibrio cholerae and found that the functional category distribution of genes in the two clusters was different. With the availability of a large number of sequenced genomes, we performed a systematic investigation of the A(2)–T(2) distribution and found that 2694 out of 2764 prokaryotic genomes have an optimal clustering number of two, indicating a consistent pattern. Analysis of the functional categories of the coding genes in each cluster in 1483 prokaryotic genomes indicated that 99.33% of the genomes exhibited a significant difference (p < 0.01) in function distribution between the two clusters. Specifically, functional category P was overrepresented in the small cluster of 98.65% of genomes, whereas categories J, K, and L were overrepresented in the larger cluster of over 98.52% of genomes. Lineage analysis showed that these preferences appear consistently across all phyla. Overall, our work revealed an almost universal clustering pattern based on the relative frequency of A(2) versus T(2) and its role in functional category preference. These findings will promote the understanding of the rationality of theoretical prediction of functional classes of genes from their nucleotide sequences and how protein function is determined by DNA sequence. GRAPHICAL ABSTRACT: [Image: see text] SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1007/s12539-021-00493-w.
Springer Nature Singapore 2021-11-24 2022 /pmc/articles/PMC9124167/ /pubmed/34817803 http://dx.doi.org/10.1007/s12539-021-00493-w Text en © The Author(s) 2021 https://creativecommons.org/licenses/by/4.0/ Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit https://creativecommons.org/licenses/by/4.0/.
title Consistent Clustering Pattern of Prokaryotic Genes Based on Base Frequency at the Second Codon Position and its Association with Functional Category Preference
topic Original Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9124167/
https://www.ncbi.nlm.nih.gov/pubmed/34817803
http://dx.doi.org/10.1007/s12539-021-00493-w