Entropic Profiler – detection of conservation in genomes using information theory

BACKGROUND: In recent decades, with the successive availability of whole genome sequences, many research efforts have been made to model DNA mathematically. Entropic Profiles (EP) were recently proposed as a new measure of continuous entropy of genome sequences. EP represent local information plot...

Bibliographic Details
Main authors: Fernandes, Francisco; Freitas, Ana T; Almeida, Jonas S; Vinga, Susana
Format: Text
Language: English
Published: BioMed Central 2009
Subjects: Technical Note
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2686720/
https://www.ncbi.nlm.nih.gov/pubmed/19416538
http://dx.doi.org/10.1186/1756-0500-2-72
author Fernandes, Francisco
Freitas, Ana T
Almeida, Jonas S
Vinga, Susana
collection PubMed
description BACKGROUND: In recent decades, with the successive availability of whole genome sequences, many research efforts have been made to model DNA mathematically. Entropic Profiles (EP) were recently proposed as a new measure of continuous entropy of genome sequences. EP represent local information plots related to DNA randomness and are based on information theory and statistical concepts. They express the weighted relative abundance of motifs for each position in genomes. Their study is relevant because under- or over-represented segments are often associated with significant biological meaning. FINDINGS: The Entropic Profiler application presented here is a new tool designed to detect and extract under- and over-represented DNA segments in genomes by using EP. It computes EP efficiently by resorting to improved algorithms and data structures, including modified suffix trees. Available through a web interface and as downloadable source code, it allows users to study positions and to search for motifs inside the whole sequence or within a specified range. DNA sequences can be entered from different sources, including FASTA files, pre-loaded examples, or previously saved work that is resumed. Besides the EP value plots, p-values and z-scores for each motif are also computed, along with the Chaos Game Representation of the sequence. CONCLUSION: EP are directly related to the statistical significance of motifs and can be considered a new method to extract and classify significant regions in genomes and to estimate local scales in DNA. The present implementation establishes an efficient and useful tool for whole genome analysis.
format Text
id pubmed-2686720
institution National Center for Biotechnology Information
language English
publishDate 2009
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-2686720 2009-05-27 Entropic Profiler – detection of conservation in genomes using information theory Fernandes, Francisco Freitas, Ana T Almeida, Jonas S Vinga, Susana BMC Res Notes Technical Note
BioMed Central 2009-05-05 /pmc/articles/PMC2686720/ /pubmed/19416538 http://dx.doi.org/10.1186/1756-0500-2-72 Text en Copyright © 2009 Vinga et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Entropic Profiler – detection of conservation in genomes using information theory
topic Technical Note