Local Renyi entropic profiles of DNA sequences

BACKGROUND: In a recent report the authors presented a new measure of continuous entropy for DNA sequences, which allows the estimation of their randomness level. The definition therein explored was based on the Rényi entropy of probability density estimation (pdf) using the Parzen's window met...

Bibliographic Details
Main authors: Vinga, Susana; Almeida, Jonas S
Format: Text
Language: English
Published: BioMed Central 2007
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2238722/
https://www.ncbi.nlm.nih.gov/pubmed/17939871
http://dx.doi.org/10.1186/1471-2105-8-393
_version_ 1782150449901600768
author Vinga, Susana
Almeida, Jonas S
author_facet Vinga, Susana
Almeida, Jonas S
author_sort Vinga, Susana
collection PubMed
description BACKGROUND: In a recent report the authors presented a new measure of continuous entropy for DNA sequences, which allows the estimation of their randomness level. The definition therein explored was based on the Rényi entropy of probability density estimation (pdf) using the Parzen's window method and applied to Chaos Game Representation/Universal Sequence Maps (CGR/USM). Subsequent work proposed a fractal pdf kernel as a more exact solution for the iterated map representation. This report extends the concepts of continuous entropy by defining DNA sequence entropic profiles using the new pdf estimations to refine the density estimation of motifs. RESULTS: The new methodology enables two results. On the one hand, it shows that the entropic profiles are directly related to the statistical significance of motifs, allowing the study of under- and over-representation of segments. On the other hand, by spanning the parameters of the kernel function, it is possible to extract important information about the scale of each conserved DNA region. The computational applications, developed in Matlab m-code, the corresponding binary executables and additional material and examples are made publicly available at . CONCLUSION: The ability to detect local conservation from a scale-independent representation of symbolic sequences is particularly relevant for biological applications where conserved motifs occur in multiple, overlapping scales, with significant future applications in the recognition of foreign genomic material and inference of motif structures.
format Text
id pubmed-2238722
institution National Center for Biotechnology Information
language English
publishDate 2007
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-2238722 2008-02-12 Local Renyi entropic profiles of DNA sequences Vinga, Susana Almeida, Jonas S BMC Bioinformatics Research Article BACKGROUND: In a recent report the authors presented a new measure of continuous entropy for DNA sequences, which allows the estimation of their randomness level. The definition therein explored was based on the Rényi entropy of probability density estimation (pdf) using the Parzen's window method and applied to Chaos Game Representation/Universal Sequence Maps (CGR/USM). Subsequent work proposed a fractal pdf kernel as a more exact solution for the iterated map representation. This report extends the concepts of continuous entropy by defining DNA sequence entropic profiles using the new pdf estimations to refine the density estimation of motifs. RESULTS: The new methodology enables two results. On the one hand, it shows that the entropic profiles are directly related to the statistical significance of motifs, allowing the study of under- and over-representation of segments. On the other hand, by spanning the parameters of the kernel function, it is possible to extract important information about the scale of each conserved DNA region. The computational applications, developed in Matlab m-code, the corresponding binary executables and additional material and examples are made publicly available at . CONCLUSION: The ability to detect local conservation from a scale-independent representation of symbolic sequences is particularly relevant for biological applications where conserved motifs occur in multiple, overlapping scales, with significant future applications in the recognition of foreign genomic material and inference of motif structures. BioMed Central 2007-10-16 /pmc/articles/PMC2238722/ /pubmed/17939871 http://dx.doi.org/10.1186/1471-2105-8-393 Text en Copyright © 2007 Vinga and Almeida; licensee BioMed Central Ltd.
http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Research Article
Vinga, Susana
Almeida, Jonas S
Local Renyi entropic profiles of DNA sequences
title Local Renyi entropic profiles of DNA sequences
title_full Local Renyi entropic profiles of DNA sequences
title_fullStr Local Renyi entropic profiles of DNA sequences
title_full_unstemmed Local Renyi entropic profiles of DNA sequences
title_short Local Renyi entropic profiles of DNA sequences
title_sort local renyi entropic profiles of dna sequences
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2238722/
https://www.ncbi.nlm.nih.gov/pubmed/17939871
http://dx.doi.org/10.1186/1471-2105-8-393
work_keys_str_mv AT vingasusana localrenyientropicprofilesofdnasequences
AT almeidajonass localrenyientropicprofilesofdnasequences