Entropic Profiler – detection of conservation in genomes using information theory
BACKGROUND: In the last decades, with the successive availability of whole genome sequences, many research efforts have been made to mathematically model DNA. Entropic Profiles (EP) were proposed recently as a new measure of continuous entropy of genome sequences. EP represent local information plot...
Main Authors: | Fernandes, Francisco, Freitas, Ana T, Almeida, Jonas S, Vinga, Susana |
---|---|
Format: | Text |
Language: | English |
Published: | BioMed Central 2009 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2686720/ https://www.ncbi.nlm.nih.gov/pubmed/19416538 http://dx.doi.org/10.1186/1756-0500-2-72 |
_version_ | 1782167465906667520 |
---|---|
author | Fernandes, Francisco Freitas, Ana T Almeida, Jonas S Vinga, Susana |
author_sort | Fernandes, Francisco |
collection | PubMed |
description | BACKGROUND: In recent decades, as whole genome sequences have become successively available, many research efforts have been made to model DNA mathematically. Entropic Profiles (EP) were recently proposed as a new measure of continuous entropy of genome sequences. EP are local information plots related to DNA randomness, based on information theory and statistical concepts. They express the weighted relative abundance of motifs for each position in a genome. Their study is relevant because under- or over-represented segments are often associated with significant biological meaning. FINDINGS: The Entropic Profiler application presented here is a new tool designed to detect and extract under- and over-represented DNA segments in genomes by using EP. It computes EP efficiently by resorting to improved algorithms and data structures, which include modified suffix trees. Available through a web interface and as downloadable source code, it allows users to study positions and to search for motifs in the whole sequence or within a specified range. DNA sequences can be entered from different sources, including FASTA files and pre-loaded examples, or by resuming previously saved work. Besides the EP value plots, p-values and z-scores for each motif are also computed, along with the Chaos Game Representation of the sequence. CONCLUSION: EP are directly related to the statistical significance of motifs and can be considered a new method to extract and classify significant regions in genomes and to estimate local scales in DNA. The present implementation provides an efficient and useful tool for whole genome analysis. |
format | Text |
id | pubmed-2686720 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2009 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-2686720 2009-05-27 Entropic Profiler – detection of conservation in genomes using information theory Fernandes, Francisco Freitas, Ana T Almeida, Jonas S Vinga, Susana BMC Res Notes Technical Note BioMed Central 2009-05-05 /pmc/articles/PMC2686720/ /pubmed/19416538 http://dx.doi.org/10.1186/1756-0500-2-72 Text en Copyright © 2009 Vinga et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Technical Note Fernandes, Francisco Freitas, Ana T Almeida, Jonas S Vinga, Susana Entropic Profiler – detection of conservation in genomes using information theory |
title | Entropic Profiler – detection of conservation in genomes using information theory |
title_sort | entropic profiler – detection of conservation in genomes using information theory |
topic | Technical Note |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2686720/ https://www.ncbi.nlm.nih.gov/pubmed/19416538 http://dx.doi.org/10.1186/1756-0500-2-72 |
work_keys_str_mv | AT fernandesfrancisco entropicprofilerdetectionofconservationingenomesusinginformationtheory AT freitasanat entropicprofilerdetectionofconservationingenomesusinginformationtheory AT almeidajonass entropicprofilerdetectionofconservationingenomesusinginformationtheory AT vingasusana entropicprofilerdetectionofconservationingenomesusinginformationtheory |
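
The description above mentions that the tool also computes the Chaos Game Representation (CGR) of the sequence. As a rough illustration only, and not the Entropic Profiler's own code, the following Python sketch computes CGR coordinates for a DNA string. The function name `cgr_points`, the corner assignment (A, C, G, T on the corners of the unit square), the starting point at the centre, and the choice to skip ambiguous symbols are all assumptions made for this example.

```python
# Hypothetical sketch of Chaos Game Representation (CGR) coordinates.
# Assumption: corners of the unit square are assigned as below, following
# a commonly used convention; the record does not state which one the tool uses.
CORNERS = {
    "A": (0.0, 0.0),
    "C": (0.0, 1.0),
    "G": (1.0, 1.0),
    "T": (1.0, 0.0),
}


def cgr_points(sequence, start=(0.5, 0.5)):
    """Return one (x, y) point per recognised nucleotide in `sequence`.

    Each point lies halfway between the previous point and the corner
    assigned to the current nucleotide. Symbols outside A, C, G, T
    (for example N) are skipped, which is an assumption of this sketch.
    """
    x, y = start
    points = []
    for base in sequence.upper():
        if base not in CORNERS:
            continue  # ignore ambiguous or unknown symbols
        cx, cy = CORNERS[base]
        # Move halfway towards the corner of the current nucleotide.
        x, y = (x + cx) / 2.0, (y + cy) / 2.0
        points.append((x, y))
    return points


if __name__ == "__main__":
    for point in cgr_points("ACGT"):
        print(point)
```

Because each step halves the distance to a corner, sequences that share a suffix of length L map to the same sub-square of side 2^-L, which is why the point density in a CGR plot reflects motif abundance.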