Varying levels of complexity in transcription factor binding motifs

Bibliographic Details
Main Authors:	Keilwagen, Jens; Grau, Jan
Format:	Online Article Text
Language:	English
Published:	Oxford University Press 2015
Subjects:	Methods Online
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4605289/ https://www.ncbi.nlm.nih.gov/pubmed/26116565 http://dx.doi.org/10.1093/nar/gkv577

author	Keilwagen, Jens; Grau, Jan
collection	PubMed
description	Binding of transcription factors to DNA is one of the keystones of gene regulation. The existence of statistical dependencies between binding site positions is widely accepted, while their relevance for computational predictions has been debated. Building probabilistic models of binding sites that may capture dependencies is still challenging, since the most successful motif discovery approaches require numerical optimization techniques, which are not suited for selecting dependency structures. To overcome this issue, we propose sparse local inhomogeneous mixture (Slim) models that combine putative dependency structures in a weighted manner allowing for numerical optimization of dependency structure and model parameters simultaneously. We find that Slim models yield a substantially better prediction performance than previous models on genomic context protein binding microarray data sets and on ChIP-seq data sets. To elucidate the reasons for the improved performance, we develop dependency logos, which allow for visual inspection of dependency structures within binding sites. We find that the dependency structures discovered by Slim models are highly diverse and highly transcription factor-specific, which emphasizes the need for flexible dependency models. The observed dependency structures range from broad heterogeneities to sparse dependencies between neighboring and non-neighboring binding site positions.
format	Online Article Text
id	pubmed-4605289
institution	National Center for Biotechnology Information
language	English
publishDate	2015
publisher	Oxford University Press
record_format	MEDLINE/PubMed
spelling	pubmed-4605289 2015-10-19 Varying levels of complexity in transcription factor binding motifs Keilwagen, Jens; Grau, Jan Nucleic Acids Res Methods Online Oxford University Press 2015-10-15 2015-10-10 /pmc/articles/PMC4605289/ /pubmed/26116565 http://dx.doi.org/10.1093/nar/gkv577 Text en © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research. http://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
title	Varying levels of complexity in transcription factor binding motifs
topic	Methods Online
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4605289/ https://www.ncbi.nlm.nih.gov/pubmed/26116565 http://dx.doi.org/10.1093/nar/gkv577