Extracting transcription factor binding sites from unaligned gene sequences with statistical models

BACKGROUND: Transcription factor binding sites (TFBSs) are crucial in the regulation of gene transcription. Recently, chromatin immunoprecipitation followed by cDNA microarray hybridization (ChIP-chip array) has been used to identify potential regulatory sequences, but the procedure can only map the...

Bibliographic Details
Main Authors:	Lu, Chung-Chin, Yuan, Wei-Hao, Chen, Te-Ming
Format:	Text
Language:	English
Published:	BioMed Central 2008
Subjects:	Research
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2638147/ https://www.ncbi.nlm.nih.gov/pubmed/19091030 http://dx.doi.org/10.1186/1471-2105-9-S12-S7

author	Lu, Chung-Chin Yuan, Wei-Hao Chen, Te-Ming
collection	PubMed
description	BACKGROUND: Transcription factor binding sites (TFBSs) are crucial in the regulation of gene transcription. Recently, chromatin immunoprecipitation followed by cDNA microarray hybridization (ChIP-chip array) has been used to identify potential regulatory sequences, but the procedure can only map the probable protein-DNA interaction loci within 1–2 kb resolution. To find out the exact binding motifs, it is necessary to build a computational method to examine the ChIP-chip array binding sequences and search for possible motifs representing the transcription factor binding sites. RESULTS: We developed a program to find out accurate motif sites from a set of unaligned DNA sequences in the yeast genome. Compared with MDscan, the prediction results suggest that, overall, our algorithm outperforms MDscan since the predicted motifs are more consistent with previously known specificities reported in the literature and have better prediction ranks. Our program also outperforms the constraint-less Cosmo program, especially in the elimination of false positives. CONCLUSION: In this study, an improved sampling algorithm is proposed to incorporate the binomial probability model to build significant initial candidate motif sets. By investigating the statistical dependence between base positions in TFBSs, the method of dependency graphs and their expanded Bayesian networks is combined. The results show that our program satisfactorily extracts transcription factor binding sites from unaligned gene sequences.
format	Text
id	pubmed-2638147
institution	National Center for Biotechnology Information
language	English
publishDate	2008
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-2638147 2009-02-24 Extracting transcription factor binding sites from unaligned gene sequences with statistical models Lu, Chung-Chin Yuan, Wei-Hao Chen, Te-Ming BMC Bioinformatics Research BioMed Central 2008-12-12 /pmc/articles/PMC2638147/ /pubmed/19091030 http://dx.doi.org/10.1186/1471-2105-9-S12-S7 Text en Copyright © 2008 Lu et al; licensee BioMed Central Ltd.
http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title	Extracting transcription factor binding sites from unaligned gene sequences with statistical models
topic	Research
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2638147/ https://www.ncbi.nlm.nih.gov/pubmed/19091030 http://dx.doi.org/10.1186/1471-2105-9-S12-S7
