PhyloScan: identification of transcription factor binding sites using cross-species evidence

Bibliographic Details
Main authors:	Carmack, C Steven; McCue, Lee Ann; Newberg, Lee A; Lawrence, Charles E
Format:	Text
Language:	English
Published:	BioMed Central 2007
Subjects:	Research
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1794230/ https://www.ncbi.nlm.nih.gov/pubmed/17244358 http://dx.doi.org/10.1186/1748-7188-2-1

author	Carmack, C Steven; McCue, Lee Ann; Newberg, Lee A; Lawrence, Charles E
collection	PubMed
description	BACKGROUND: When transcription factor binding sites are known for a particular transcription factor, it is possible to construct a motif model that can be used to scan sequences for additional sites. However, few statistically significant sites are revealed when a transcription factor binding site motif model is used to scan a genome-scale database. METHODS: We have developed a scanning algorithm, PhyloScan, which combines evidence from matching sites found in orthologous data from several related species with evidence from multiple sites within an intergenic region, to better detect regulons. The orthologous sequence data may be multiply aligned, unaligned, or a combination of aligned and unaligned. In aligned data, PhyloScan statistically accounts for the phylogenetic dependence of the species contributing data to the alignment and, in unaligned data, the evidence for sites is combined assuming phylogenetic independence of the species. The statistical significance of the gene predictions is calculated directly, without employing training sets. RESULTS: In a test of our methodology on synthetic data modeled on seven Enterobacteriales, four Vibrionales, and three Pasteurellales species, PhyloScan produces better sensitivity and specificity than MONKEY, an advanced scanning approach that also searches a genome for transcription factor binding sites using phylogenetic information. The application of the algorithm to real sequence data from seven Enterobacteriales species identifies novel Crp and PurR transcription factor binding sites, thus providing several new potential sites for these transcription factors. These sites enable targeted experimental validation and thus further delineation of the Crp and PurR regulons in E. coli. CONCLUSION: Better sensitivity and specificity can be achieved through a combination of (1) using mixed alignable and non-alignable sequence data and (2) combining evidence from multiple sites within an intergenic region.
format	Text
id	pubmed-1794230
institution	National Center for Biotechnology Information
language	English
publishDate	2007
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-1794230 2007-02-16 Algorithms Mol Biol Research BioMed Central 2007-01-23 /pmc/articles/PMC1794230/ /pubmed/17244358 http://dx.doi.org/10.1186/1748-7188-2-1 Text en Copyright © 2007 Carmack et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title	PhyloScan: identification of transcription factor binding sites using cross-species evidence
topic	Research
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1794230/ https://www.ncbi.nlm.nih.gov/pubmed/17244358 http://dx.doi.org/10.1186/1748-7188-2-1
