
An analysis of the positional distribution of DNA motifs in promoter regions and its biological relevance

BACKGROUND: Motif finding algorithms have developed in their ability to use computationally efficient methods to detect patterns in biological sequences. However, the subsequent classification of the output still suffers from some limitations, which make it difficult to assess the biological signific...


Bibliographic Details
Main authors: Casimiro, Ana C, Vinga, Susana, Freitas, Ana T, Oliveira, Arlindo L
Format: Text
Language: English
Published: BioMed Central 2008
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2375121/
https://www.ncbi.nlm.nih.gov/pubmed/18257925
http://dx.doi.org/10.1186/1471-2105-9-89
author Casimiro, Ana C
Vinga, Susana
Freitas, Ana T
Oliveira, Arlindo L
collection PubMed
description BACKGROUND: Motif finding algorithms have developed in their ability to use computationally efficient methods to detect patterns in biological sequences. However, the subsequent classification of the output still suffers from some limitations, which make it difficult to assess the biological significance of the motifs found. Previous work has highlighted the existence of positional bias of motifs in DNA sequences, which might not only indicate that a pattern is important but also provide hints about the positions where it occurs preferentially. RESULTS: We propose to integrate position uniformity tests and over-representation tests to improve the accuracy of the classification of motifs. Using artificial data, we compared three different statistical tests (Chi-Square, Kolmogorov-Smirnov and a Chi-Square bootstrap) to assess whether a given motif occurs uniformly in the promoter region of a gene. Using the test that performed best on this dataset, we proceeded to study the positional distribution of several well-known cis-regulatory elements in the promoter sequences of different organisms (S. cerevisiae, H. sapiens, D. melanogaster, E. coli and several dicotyledonous plants). The results show that position conservation is relevant for the transcriptional machinery. CONCLUSION: We conclude that many biologically relevant motifs appear heterogeneously distributed in the promoter region of genes and, therefore, that non-uniformity is a good indicator of biological relevance and can be used to complement the commonly used over-representation tests. In this article we present the results obtained for the S. cerevisiae data sets.
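The abstract names the tests but gives no implementation details, so the following is only a minimal sketch of what a position uniformity check could look like, not the authors' code. It assumes motif occurrences are given as 0-based offsets within a promoter region of fixed length, and all function names, the bin count, and the simulation settings are hypothetical. A Monte Carlo simulation of uniformly placed occurrences is used here as a stand-in for the resampling-based variant; the paper's actual bootstrap procedure may differ.

```python
import random


def chi_square_stat(positions, region_length, n_bins=10):
    """Pearson chi-square statistic against equal expected counts per bin."""
    counts = [0] * n_bins
    for p in positions:
        # Map a 0-based offset in [0, region_length) to one of n_bins bins.
        counts[min(p * n_bins // region_length, n_bins - 1)] += 1
    expected = len(positions) / n_bins
    return sum((c - expected) ** 2 / expected for c in counts)


def ks_stat(positions, region_length):
    """Kolmogorov-Smirnov distance between the empirical CDF and Uniform(0, 1)."""
    xs = sorted(p / region_length for p in positions)
    n = len(xs)
    return max(max((i + 1) / n - x, x - i / n) for i, x in enumerate(xs))


def monte_carlo_p_value(stat_fn, positions, region_length, n_sim=2000, seed=0):
    """Share of simulated uniform samples whose statistic is >= the observed one.

    stat_fn takes a list of positions and returns a number (hypothetical interface).
    """
    rng = random.Random(seed)
    observed = stat_fn(positions)
    n = len(positions)
    hits = 0
    for _ in range(n_sim):
        simulated = [rng.randrange(region_length) for _ in range(n)]
        if stat_fn(simulated) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_sim + 1)


if __name__ == "__main__":
    region = 1000  # assumed promoter length in base pairs
    # Made-up occurrences concentrated between offsets 100 and 149.
    clustered = [100 + (i * 7) % 50 for i in range(60)]
    p_chi = monte_carlo_p_value(lambda ps: chi_square_stat(ps, region), clustered, region)
    p_ks = monte_carlo_p_value(lambda ps: ks_stat(ps, region), clustered, region)
    print("chi-square p-value:", p_chi)
    print("KS p-value:", p_ks)
```

A small p-value suggests the occurrences are not uniformly spread over the region. Simulating the null distribution avoids relying on asymptotic approximations when the number of occurrences per bin is small.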
format Text
id pubmed-2375121
institution National Center for Biotechnology Information
language English
publishDate 2008
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-2375121 2008-05-12 An analysis of the positional distribution of DNA motifs in promoter regions and its biological relevance. Casimiro, Ana C; Vinga, Susana; Freitas, Ana T; Oliveira, Arlindo L. BMC Bioinformatics. Research Article.
BioMed Central 2008-02-07 /pmc/articles/PMC2375121/ /pubmed/18257925 http://dx.doi.org/10.1186/1471-2105-9-89 Text en Copyright © 2008 Casimiro et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title An analysis of the positional distribution of DNA motifs in promoter regions and its biological relevance
topic Research Article