WordSpy: identifying transcription factor binding motifs by building a dictionary and learning a grammar

Transcription factor (TF) binding sites or motifs (TFBMs) are functional cis-regulatory DNA sequences that play an essential role in gene transcriptional regulation. Although many experimental and computational methods have been developed, finding TFBMs remains a challenging problem. We propose and...

Bibliographic Details
Main authors: Wang, Guandong; Yu, Taotao; Zhang, Weixiong
Format: Text
Language: English
Published: Oxford University Press 2005
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1160252/
https://www.ncbi.nlm.nih.gov/pubmed/15980501
http://dx.doi.org/10.1093/nar/gki492
_version_ 1782124391841136640
author Wang, Guandong
Yu, Taotao
Zhang, Weixiong
collection PubMed
description Transcription factor (TF) binding sites or motifs (TFBMs) are functional cis-regulatory DNA sequences that play an essential role in gene transcriptional regulation. Although many experimental and computational methods have been developed, finding TFBMs remains a challenging problem. We propose and develop a novel dictionary-based motif-finding algorithm, which we call WordSpy. One significant feature of WordSpy is the combination of a word-counting method and a statistical model, which consists of a dictionary of motifs and a grammar specifying their usage. The algorithm is suitable for genome-wide motif finding; it is capable of discovering hundreds of motifs from a large set of promoters in a single run. We further enhance WordSpy by applying gene expression information to separate true TFBMs from spurious ones, and by incorporating negative sequences to identify discriminative motifs. In addition, we use randomly selected promoters from the genome to evaluate the significance of the discovered motifs. The output from WordSpy consists of an ordered list of putative motifs and a set of regulatory sequences with motif binding sites highlighted. The web server of WordSpy is available at .
format Text
id pubmed-1160252
institution National Center for Biotechnology Information
language English
publishDate 2005
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-1160252 2005-06-29 Nucleic Acids Res Article Oxford University Press 2005-07-01 2005-06-27 /pmc/articles/PMC1160252/ /pubmed/15980501 http://dx.doi.org/10.1093/nar/gki492 Text en © The Author 2005. Published by Oxford University Press. All rights reserved
title WordSpy: identifying transcription factor binding motifs by building a dictionary and learning a grammar
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1160252/
https://www.ncbi.nlm.nih.gov/pubmed/15980501
http://dx.doi.org/10.1093/nar/gki492