
Distinguishing Functional DNA Words; A Method for Measuring Clustering Levels

Functional DNA sub-sequences and genome elements are spatially clustered through the genome just as keywords in literary texts. Therefore, some of the methods for ranking words in texts can also be used to compare different DNA sub-sequences. In analogy with the literary texts, here we claim that th...


Bibliographic Details
Main authors: Moghaddasi, Hanieh; Khalifeh, Khosrow; Darooneh, Amir Hossein
Format: Online Article Text
Language: English
Published: Nature Publishing Group 2017
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5269680/
https://www.ncbi.nlm.nih.gov/pubmed/28128320
http://dx.doi.org/10.1038/srep41543
author Moghaddasi, Hanieh
Khalifeh, Khosrow
Darooneh, Amir Hossein
collection PubMed
description Functional DNA sub-sequences and genome elements are spatially clustered throughout the genome, just as keywords are in literary texts. Therefore, some of the methods for ranking words in texts can also be used to compare different DNA sub-sequences. By analogy with literary texts, we claim here that the distribution of distances between successive sub-sequences (words) is q-exponential, which is the distribution function in non-extensive statistical mechanics. Thus the q-parameter can be used as a measure of word clustering levels. Here, we analyzed the distribution of distances between consecutive occurrences of the 16 possible dinucleotides in human chromosomes to obtain their corresponding q-parameters. We found that CG, a biologically important two-letter word because of its methylation, has the highest clustering level. This finding shows the predictive ability of the method in biology. We also proposed that chromosome 18, which has the largest value of the q-parameter for gene promoters, is more sensitive to diet and lifestyle. We extended our study to compare the genomes of some selected organisms and concluded that the clustering level of CGs increases in evolutionarily higher organisms compared to lower ones.
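The distance analysis described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the function names, the toy sequence, and the choice to count overlapping occurrences are assumptions made for illustration. The q-exponential is written in its standard Tsallis form, e_q(x) = [1 + (1 - q) x]^(1/(1 - q)), and the decay form e_q(-beta * d) with a scale parameter beta is likewise an assumption, since the record does not give the exact expression that was fitted.

```python
import math


def occurrence_positions(sequence, word):
    """Return the start index of every occurrence of `word` (overlaps allowed)."""
    positions = []
    start = sequence.find(word)
    while start != -1:
        positions.append(start)
        start = sequence.find(word, start + 1)
    return positions


def successive_distances(positions):
    """Distances between consecutive occurrences of the same word."""
    return [b - a for a, b in zip(positions, positions[1:])]


def q_exp_decay(d, q, beta):
    """Standard q-exponential decay e_q(-beta * d), for d >= 0 and beta > 0.

    Reduces to exp(-beta * d) as q -> 1; for q > 1 the tail is a power law.
    The exact form fitted in the paper is not stated in this record.
    """
    if abs(q - 1.0) < 1e-12:
        return math.exp(-beta * d)
    base = 1.0 - (1.0 - q) * beta * d
    if base <= 0.0:  # usual cutoff convention, only reachable when q < 1
        return 0.0
    return base ** (1.0 / (1.0 - q))


def empirical_survival(distances):
    """Fraction of observed distances that are >= d, for each distinct d."""
    n = len(distances)
    return {d: sum(1 for x in distances if x >= d) / n
            for d in sorted(set(distances))}


# Hypothetical toy sequence, far too short for a meaningful fit.
toy = "ACGCGTTACGAACGCG"
cg_positions = occurrence_positions(toy, "CG")
cg_distances = successive_distances(cg_positions)
survival = empirical_survival(cg_distances)
```

On a real chromosome, the empirical survival values would be compared with q_exp_decay over a range of q and beta, and the best-fitting q would serve as the clustering measure the abstract describes. The fitting step is omitted here because the record does not say how it was done.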
format Online
Article
Text
id pubmed-5269680
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher Nature Publishing Group
record_format MEDLINE/PubMed
spelling pubmed-5269680 2017-02-01 Distinguishing Functional DNA Words; A Method for Measuring Clustering Levels Moghaddasi, Hanieh Khalifeh, Khosrow Darooneh, Amir Hossein Sci Rep Article Nature Publishing Group 2017-01-27 /pmc/articles/PMC5269680/ /pubmed/28128320 http://dx.doi.org/10.1038/srep41543 Text en Copyright © 2017, The Author(s) http://creativecommons.org/licenses/by/4.0/ This work is licensed under a Creative Commons Attribution 4.0 International License.
The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in the credit line; if the material is not included under the Creative Commons license, users will need to obtain permission from the license holder to reproduce the material. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/
title Distinguishing Functional DNA Words; A Method for Measuring Clustering Levels
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5269680/
https://www.ncbi.nlm.nih.gov/pubmed/28128320
http://dx.doi.org/10.1038/srep41543