
Meta-analysis discovery of tissue-specific DNA sequence motifs from mammalian gene expression data

BACKGROUND: A key step in the regulation of gene expression is the sequence-specific binding of transcription factors (TFs) to their DNA recognition sites. However, elucidating TF binding site (TFBS) motifs in higher eukaryotes has been challenging, even when employing cross-species sequence conserv...


Bibliographic Details
Main Authors: Huber, Bertrand R, Bulyk, Martha L
Format: Text
Language: English
Published: BioMed Central 2006
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1522027/
https://www.ncbi.nlm.nih.gov/pubmed/16643658
http://dx.doi.org/10.1186/1471-2105-7-229
_version_ 1782128788104019968
author Huber, Bertrand R
Bulyk, Martha L
author_facet Huber, Bertrand R
Bulyk, Martha L
author_sort Huber, Bertrand R
collection PubMed
description BACKGROUND: A key step in the regulation of gene expression is the sequence-specific binding of transcription factors (TFs) to their DNA recognition sites. However, elucidating TF binding site (TFBS) motifs in higher eukaryotes has been challenging, even when employing cross-species sequence conservation. We hypothesized that for human and mouse, many orthologous genes expressed in a similarly tissue-specific manner in both human and mouse gene expression data, are likely to be co-regulated by orthologous TFs that bind to DNA sequence motifs present within noncoding sequence conserved between these genomes. RESULTS: We performed automated motif searching and merging across four different motif finding algorithms, followed by filtering of the resulting motifs for those that contain blocks of information content. Applying this motif finding strategy to conserved noncoding regions surrounding co-expressed tissue-specific human genes allowed us to discover both previously known, and many novel candidate, regulatory DNA motifs in all 18 tissue-specific expression clusters that we examined. For previously known TFBS motifs, we observed that if a TF was expressed in the specified tissue of interest, then in most cases we identified a motif that matched its TRANSFAC motif; conversely, of all those discovered motifs that matched TRANSFAC motifs, most of the corresponding TF transcripts were expressed in the tissue(s) corresponding to the expression cluster for which the motif was found. CONCLUSION: Our results indicate that the integration of the results from multiple motif finding tools identifies and ranks highly more known and novel motifs than does the use of just one of these tools. In addition, we believe that our simultaneous enrichment strategies helped to identify likely human cis regulatory elements. A number of the discovered motifs may correspond to novel binding site motifs for as yet uncharacterized tissue-specific TFs. We expect this strategy to be useful for identifying motifs in other metazoan genomes.
format Text
id pubmed-1522027
institution National Center for Biotechnology Information
language English
publishDate 2006
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-1522027 2006-07-28 Meta-analysis discovery of tissue-specific DNA sequence motifs from mammalian gene expression data Huber, Bertrand R Bulyk, Martha L BMC Bioinformatics Research Article BioMed Central 2006-04-27 /pmc/articles/PMC1522027/ /pubmed/16643658 http://dx.doi.org/10.1186/1471-2105-7-229 Text en Copyright © 2006 Huber and Bulyk; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Meta-analysis discovery of tissue-specific DNA sequence motifs from mammalian gene expression data
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1522027/
https://www.ncbi.nlm.nih.gov/pubmed/16643658
http://dx.doi.org/10.1186/1471-2105-7-229