
De-Novo Discovery of Differentially Abundant Transcription Factor Binding Sites Including Their Positional Preference

Transcription factors are a main component of gene regulation as they activate or repress gene expression by binding to specific binding sites in promoters. The de-novo discovery of transcription factor binding sites in target regions obtained by wet-lab experiments is a challenging problem in compu...


Bibliographic Details
Main authors: Keilwagen, Jens, Grau, Jan, Paponov, Ivan A., Posch, Stefan, Strickert, Marc, Grosse, Ivo
Format: Text
Language: English
Published: Public Library of Science 2011
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3037384/
https://www.ncbi.nlm.nih.gov/pubmed/21347314
http://dx.doi.org/10.1371/journal.pcbi.1001070
author Keilwagen, Jens
Grau, Jan
Paponov, Ivan A.
Posch, Stefan
Strickert, Marc
Grosse, Ivo
author_sort Keilwagen, Jens
collection PubMed
description Transcription factors are a main component of gene regulation as they activate or repress gene expression by binding to specific binding sites in promoters. The de-novo discovery of transcription factor binding sites in target regions obtained by wet-lab experiments is a challenging problem in computational biology, which has not been fully solved yet. Here, we present a de-novo motif discovery tool called Dispom for finding differentially abundant transcription factor binding sites that models existing positional preferences of binding sites and adjusts the length of the motif in the learning process. Evaluating Dispom, we find that its prediction performance is superior to existing tools for de-novo motif discovery for 18 benchmark data sets with planted binding sites, and for a metazoan compendium based on experimental data from microarray, ChIP-chip, ChIP-DSL, and DamID as well as Gene Ontology data. Finally, we apply Dispom to find binding sites differentially abundant in promoters of auxin-responsive genes extracted from Arabidopsis thaliana microarray data, and we find a motif that can be interpreted as a refined auxin-responsive element predominantly positioned in the 250-bp region upstream of the transcription start site. Using an independent data set of auxin-responsive genes, we find in genome-wide predictions that the refined motif is more specific for auxin-responsive genes than the canonical auxin-responsive element. In general, Dispom can be used to find differentially abundant motifs in sequences of any origin. However, the positional distribution learned by Dispom is especially beneficial if all sequences are aligned to some anchor point like the transcription start site in the case of promoter sequences. We demonstrate that the combination of searching for differentially abundant motifs and inferring a position distribution from the data is beneficial for de-novo motif discovery. Hence, we make the tool freely available as a component of the open-source Java framework Jstacs and as a stand-alone application at http://www.jstacs.de/index.php/Dispom.
format Text
id pubmed-3037384
institution National Center for Biotechnology Information
language English
publishDate 2011
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-3037384 2011-02-23 PLoS Comput Biol Research Article Public Library of Science 2011-02-10 /pmc/articles/PMC3037384/ /pubmed/21347314 http://dx.doi.org/10.1371/journal.pcbi.1001070 Text en Keilwagen et al. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title De-Novo Discovery of Differentially Abundant Transcription Factor Binding Sites Including Their Positional Preference
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3037384/
https://www.ncbi.nlm.nih.gov/pubmed/21347314
http://dx.doi.org/10.1371/journal.pcbi.1001070