
WeederH: an algorithm for finding conserved regulatory motifs and regions in homologous sequences

BACKGROUND: This work addresses the problem of detecting conserved transcription factor binding sites and in general regulatory regions through the analysis of sequences from homologous genes, an approach that is becoming more and more widely used given the ever increasing amount of genomic data ava...


Bibliographic Details
Main Authors: Pavesi, Giulio, Zambelli, Federico, Pesole, Graziano
Format: Text
Language: English
Published: BioMed Central 2007
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1803799/
https://www.ncbi.nlm.nih.gov/pubmed/17286865
http://dx.doi.org/10.1186/1471-2105-8-46
author Pavesi, Giulio
Zambelli, Federico
Pesole, Graziano
collection PubMed
description BACKGROUND: This work addresses the problem of detecting conserved transcription factor binding sites and, more generally, regulatory regions through the analysis of sequences from homologous genes, an approach that is increasingly widely used given the ever-increasing amount of genomic data available. RESULTS: We present an algorithm that identifies conserved transcription factor binding sites in a given sequence by comparing it to one or more homologs, adapting a framework we previously introduced for the discovery of sites in sequences from co-regulated genes. Unlike the most commonly used methods, the approach we present neither requires nor computes an alignment of the sequences investigated, nor does it resort to descriptors of the binding specificity of known transcription factors. The main novel idea we introduce is a relative measure of conservation, assuming that true functional elements should show a higher level of conservation than the sequence surrounding them. We present tests where we applied the algorithm to the identification of conserved annotated sites in homologous promoters, as well as in distal regions like enhancers. CONCLUSION: Results of the tests show how the algorithm can provide fast and reliable predictions of conserved transcription factor binding sites regulating the transcription of a gene, with better performance than other available methods for the same task. We also show examples of how the algorithm can be successfully employed when promoter annotations of the genes investigated are missing, or when regulatory sites and regions are located far away from the genes.
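The idea of an alignment-free, relative measure of conservation described in the abstract can be illustrated with a toy sketch. This is not the WeederH algorithm, whose scoring is not given in this record. It assumes, purely for illustration, that the conservation of a k-mer is k minus its smallest Hamming distance to any window of the homolog, and that "relative" means subtracting the mean score of neighbouring positions. All function names and the `k` and `flank` parameters are hypothetical.

```python
def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def best_match_distance(kmer: str, homolog: str) -> int:
    """Smallest Hamming distance between kmer and any window of the homolog.

    No alignment is computed: every window of the homolog is tried.
    """
    k = len(kmer)
    if len(homolog) < k:
        raise ValueError("homolog is shorter than the k-mer")
    return min(hamming(kmer, homolog[j:j + k])
               for j in range(len(homolog) - k + 1))


def conservation_profile(reference: str, homolog: str, k: int) -> list[int]:
    """Assumed absolute score per reference position: k minus best distance."""
    return [k - best_match_distance(reference[i:i + k], homolog)
            for i in range(len(reference) - k + 1)]


def relative_conservation(profile: list[float], flank: int) -> list[float]:
    """Assumed relative score: each value minus the mean of its neighbours
    within `flank` positions on either side (the position itself excluded)."""
    scores = []
    for i, s in enumerate(profile):
        lo, hi = max(0, i - flank), min(len(profile), i + flank + 1)
        neighbours = profile[lo:i] + profile[i + 1:hi]
        background = sum(neighbours) / len(neighbours) if neighbours else s
        scores.append(s - background)
    return scores


if __name__ == "__main__":
    # Made-up sequences sharing the word GATTACA; not data from the paper.
    ref = "TTGCAGATTACACCGTA"
    hom = "CCAAGATTACAGGTTCA"
    profile = conservation_profile(ref, hom, k=7)
    for i, score in enumerate(relative_conservation(profile, flank=3)):
        print(i, ref[i:i + 7], profile[i], round(score, 2))
```

Positions whose relative score stands out above zero are better conserved than their surroundings under these assumptions. The exhaustive window scan is quadratic in sequence length and is meant only to make the concept concrete.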
format Text
id pubmed-1803799
institution National Center for Biotechnology Information
language English
publishDate 2007
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-1803799 2007-02-23 WeederH: an algorithm for finding conserved regulatory motifs and regions in homologous sequences Pavesi, Giulio Zambelli, Federico Pesole, Graziano BMC Bioinformatics Research Article
BioMed Central 2007-02-07 /pmc/articles/PMC1803799/ /pubmed/17286865 http://dx.doi.org/10.1186/1471-2105-8-46 Text en Copyright © 2007 Pavesi et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title WeederH: an algorithm for finding conserved regulatory motifs and regions in homologous sequences
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1803799/
https://www.ncbi.nlm.nih.gov/pubmed/17286865
http://dx.doi.org/10.1186/1471-2105-8-46