Information theoretic perspective on genome clustering

Shannon’s information theoretic perspective of communication helps one to understand the storage and processing of information in one-dimensional sequences. An information theoretic analysis of 937 available completely sequenced prokaryotic genomes and 238 eukaryotic chromosomes is presented. Inform...

Bibliographic Details
Main authors: Veluchamy, Alaguraj, Mehta, Preeti, Srividhya, K.V., Vikram, Hirendra, Govind, M.K., Gupta, Ramneek, Aziz Bin Dukhyil, Abdul, Abdullah Alharbi, Raed, Abdullah Aloyuni, Saleh, Hassan, Mohamed M., Krishnaswamy, S.
Format: Online Article Text
Language: English
Published: Elsevier 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7938122/
https://www.ncbi.nlm.nih.gov/pubmed/33732074
http://dx.doi.org/10.1016/j.sjbs.2020.12.039
_version_ 1783661538071019520
author Veluchamy, Alaguraj
Mehta, Preeti
Srividhya, K.V.
Vikram, Hirendra
Govind, M.K.
Gupta, Ramneek
Aziz Bin Dukhyil, Abdul
Abdullah Alharbi, Raed
Abdullah Aloyuni, Saleh
Hassan, Mohamed M.
Krishnaswamy, S.
collection PubMed
description Shannon’s information theoretic perspective of communication helps one to understand the storage and processing of information in one-dimensional sequences. An information theoretic analysis of 937 available completely sequenced prokaryotic genomes and 238 eukaryotic chromosomes is presented. Information content (Id) values were used to cluster these chromosomes. Chargaff’s second parity rule, i.e. compositional self-complementarity, an empirical fact, is observed in all the genomes except that of the proteobacterium Candidatus Hodgkinia cicadicola. The high information content that arises from biased base composition in all 14 chromosomes of Plasmodium falciparum is also found in two prokaryotic genomes, viz. Buchnera aphidicola str. Cc (Cinara cedri) and Candidatus Carsonella ruddii PV. Despite size and compositional variations, neither prokaryotic nor eukaryotic genomes deviate significantly from an equiprobable and random situation. The eukaryotic chromosomes of an organism tend to have similar informational restraints, as seen when a simple distance-based method is used to cluster them. In certain eukaryotes, Id values are also similar for the two arms (p and q) of a chromosome. The results of this study confirm that information content can provide insights into the clustering of genomes and the evolution of their messaging strategies. An efficient and robust standalone Perl CGI tool based on this information theory algorithm was created for the analysis of whole genomes and is available at https://github.com/AlagurajVeluchamy/InformationTheory.
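The quantities named in the description can be illustrated with a minimal Python sketch. This is not the authors' Perl tool, and the record does not give the exact definition of Id, so the sketch assumes a simple single-base measure: the divergence of the Shannon entropy from the equiprobable maximum, log2(4) minus H. The function names (`base_frequencies`, `shannon_entropy`, `divergence_from_equiprobable`, `parity_deviation`) are hypothetical and chosen only for illustration. The last function gives a rough check of Chargaff's second parity rule on a single strand by comparing the frequency of A with T and of C with G.

```python
import math
from collections import Counter

BASES = "ACGT"


def base_frequencies(seq):
    """Return relative frequencies of A, C, G, T, ignoring any other symbol."""
    counts = Counter(ch for ch in seq.upper() if ch in BASES)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A/C/G/T symbols")
    return {b: counts[b] / total for b in BASES}


def shannon_entropy(freqs):
    """Shannon entropy in bits per base for a frequency dictionary."""
    return -sum(p * math.log2(p) for p in freqs.values() if p > 0)


def divergence_from_equiprobable(seq):
    """Assumed stand-in for an information content value:
    log2(4) minus the observed single-base entropy (0 for a uniform composition)."""
    return math.log2(len(BASES)) - shannon_entropy(base_frequencies(seq))


def parity_deviation(seq):
    """Single-strand deviation from Chargaff's second parity rule:
    |f(A) - f(T)| + |f(C) - f(G)|; values near 0 indicate compliance."""
    f = base_frequencies(seq)
    return abs(f["A"] - f["T"]) + abs(f["C"] - f["G"])
```

Under these assumptions, a sequence with uniform base composition gives a divergence of zero, while a strongly AT-biased sequence gives a larger value. Chromosomes could then be compared by the absolute difference of their values as one simple distance, though the distance used in the article is not specified in this record.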
format Online
Article
Text
id pubmed-7938122
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Elsevier
record_format MEDLINE/PubMed
spelling pubmed-7938122 2021-03-16 Information theoretic perspective on genome clustering Veluchamy, Alaguraj Mehta, Preeti Srividhya, K.V. Vikram, Hirendra Govind, M.K. Gupta, Ramneek Aziz Bin Dukhyil, Abdul Abdullah Alharbi, Raed Abdullah Aloyuni, Saleh Hassan, Mohamed M. Krishnaswamy, S. Saudi J Biol Sci Original Article
Elsevier 2021-03 2020-12-31 /pmc/articles/PMC7938122/ /pubmed/33732074 http://dx.doi.org/10.1016/j.sjbs.2020.12.039 Text en © 2020 The Author(s) http://creativecommons.org/licenses/by-nc-nd/4.0/ This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
title Information theoretic perspective on genome clustering
topic Original Article