WordSpy: identifying transcription factor binding motifs by building a dictionary and learning a grammar
Transcription factor (TF) binding sites or motifs (TFBMs) are functional cis-regulatory DNA sequences that play an essential role in gene transcriptional regulation. Although many experimental and computational methods have been developed, finding TFBMs remains a challenging problem. We propose and...
Main authors: | Wang, Guandong; Yu, Taotao; Zhang, Weixiong |
---|---|
Format: | Text |
Language: | English |
Published: | Oxford University Press, 2005 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1160252/ https://www.ncbi.nlm.nih.gov/pubmed/15980501 http://dx.doi.org/10.1093/nar/gki492 |
_version_ | 1782124391841136640 |
---|---|
author | Wang, Guandong; Yu, Taotao; Zhang, Weixiong |
author_facet | Wang, Guandong; Yu, Taotao; Zhang, Weixiong |
author_sort | Wang, Guandong |
collection | PubMed |
description | Transcription factor (TF) binding sites or motifs (TFBMs) are functional cis-regulatory DNA sequences that play an essential role in gene transcriptional regulation. Although many experimental and computational methods have been developed, finding TFBMs remains a challenging problem. We propose and develop a novel dictionary-based motif finding algorithm, which we call WordSpy. One significant feature of WordSpy is the combination of a word counting method and a statistical model that consists of a dictionary of motifs and a grammar specifying their usage. The algorithm is suitable for genome-wide motif finding; it is capable of discovering hundreds of motifs from a large set of promoters in a single run. We further enhance WordSpy by applying gene expression information to separate true TFBMs from spurious ones, and by incorporating negative sequences to identify discriminative motifs. In addition, we use randomly selected promoters from the genome to evaluate the significance of the discovered motifs. The output from WordSpy consists of an ordered list of putative motifs and a set of regulatory sequences with motif binding sites highlighted. The web server of WordSpy is available at . |
format | Text |
id | pubmed-1160252 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2005 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-1160252 2005-06-29 Nucleic Acids Res Article Oxford University Press 2005-07-01 2005-06-27 /pmc/articles/PMC1160252/ /pubmed/15980501 http://dx.doi.org/10.1093/nar/gki492 Text en © The Author 2005. Published by Oxford University Press. All rights reserved |
title | WordSpy: identifying transcription factor binding motifs by building a dictionary and learning a grammar |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1160252/ https://www.ncbi.nlm.nih.gov/pubmed/15980501 http://dx.doi.org/10.1093/nar/gki492 |
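
The description in this record mentions a word counting method applied to promoter sequences, with randomly selected promoters used to judge significance. The sketch below illustrates only that general idea and is not the WordSpy algorithm: it counts overlapping fixed-length words and ranks them by a simple binomial z-score against a background set. The function names, the add-one pseudocount, the `min_z` threshold, and the toy sequences are all assumptions made for illustration.

```python
from collections import Counter
from math import sqrt


def count_words(sequences, k):
    """Count every overlapping length-k word made only of A, C, G, T."""
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            word = seq[i:i + k]
            if set(word) <= set("ACGT"):  # skip words with N or other symbols
                counts[word] += 1
    return counts


def overrepresented_words(foreground, background, k, min_z=2.0):
    """Rank foreground words by a z-score against background frequencies.

    Hypothetical scoring: the background frequency (with an add-one
    pseudocount) is treated as the per-position probability of a word,
    and the foreground count is compared with its binomial expectation.
    """
    fg = count_words(foreground, k)
    bg = count_words(background, k)
    fg_total = sum(fg.values())
    bg_total = sum(bg.values())
    results = []
    for word, observed in fg.items():
        p = (bg[word] + 1) / (bg_total + 4 ** k)
        expected = fg_total * p
        z = (observed - expected) / sqrt(expected * (1 - p))
        if z >= min_z:
            results.append((word, observed, z))
    # Highest z-score first
    return sorted(results, key=lambda r: r[2], reverse=True)


if __name__ == "__main__":
    # Toy data: the word TGACTCA is planted in every foreground sequence.
    foreground = ["AAATGACTCATTT", "CCTGACTCAGG", "GTGACTCAAC"]
    background = ["ACGTACGTACGTACGT", "GGCCAATTGGCCAATT"]
    for word, observed, z in overrepresented_words(foreground, background, 7)[:3]:
        print(word, observed, round(z, 1))
```

A real motif finder would also need to handle degenerate words, reverse complements, and overlapping occurrences, none of which this sketch attempts.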