MARZ: an algorithm to combinatorially analyze gapped n-mer models of transcription factor binding

BACKGROUND: A key challenge in understanding the molecular mechanisms that control gene regulation is the characterization of the specificity with which transcription factor proteins bind to specific DNA sequences. A number of computational approaches have been developed to examine these interaction...

Bibliographic Details
Main Authors: Zellers, Rowan G; Drewell, Robert A; Dresch, Jacqueline M
Format: Online Article Text
Language: English
Published: BioMed Central 2015
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4384306/
https://www.ncbi.nlm.nih.gov/pubmed/25637281
http://dx.doi.org/10.1186/s12859-014-0446-3
_version_ 1782364882478301184
author Zellers, Rowan G
Drewell, Robert A
Dresch, Jacqueline M
author_facet Zellers, Rowan G
Drewell, Robert A
Dresch, Jacqueline M
author_sort Zellers, Rowan G
collection PubMed
description BACKGROUND: A key challenge in understanding the molecular mechanisms that control gene regulation is the characterization of the specificity with which transcription factor proteins bind to specific DNA sequences. A number of computational approaches have been developed to examine these interactions, including simple mononucleotide and dinucleotide position weight matrix models. RESULTS: Here we develop a novel, unbiased computational algorithm, MARZ, that systematically analyzes all possible gapped matrices across a fixed number of nucleotides. In addition, to evaluate the ability of these matrix models to predict in vivo binding sites, we utilize a new scoring system and, in combination with established scoring methods and statistical analysis, test the performance of 32 different gapped matrices on the well characterized HUNCHBACK transcription factor in Drosophila. CONCLUSIONS: Our results indicate that in many cases gapped matrix models can outperform traditional models, but that the relative strength of the binding sites considered in the analysis can profoundly influence the predictive ability of specific models. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12859-014-0446-3) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-4384306
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-43843062015-04-04 MARZ: an algorithm to combinatorially analyze gapped n-mer models of transcription factor binding Zellers, Rowan G Drewell, Robert A Dresch, Jacqueline M BMC Bioinformatics Methodology Article BACKGROUND: A key challenge in understanding the molecular mechanisms that control gene regulation is the characterization of the specificity with which transcription factor proteins bind to specific DNA sequences. A number of computational approaches have been developed to examine these interactions, including simple mononucleotide and dinucleotide position weight matrix models. RESULTS: Here we develop a novel, unbiased computational algorithm, MARZ, that systematically analyzes all possible gapped matrices across a fixed number of nucleotides. In addition, to evaluate the ability of these matrix models to predict in vivo binding sites, we utilize a new scoring system and, in combination with established scoring methods and statistical analysis, test the performance of 32 different gapped matrices on the well characterized HUNCHBACK transcription factor in Drosophila. CONCLUSIONS: Our results indicate that in many cases gapped matrix models can outperform traditional models, but that the relative strength of the binding sites considered in the analysis can profoundly influence the predictive ability of specific models. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12859-014-0446-3) contains supplementary material, which is available to authorized users. BioMed Central 2015-01-31 /pmc/articles/PMC4384306/ /pubmed/25637281 http://dx.doi.org/10.1186/s12859-014-0446-3 Text en © Zellers et al.; licensee BioMed Central. 2015 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. 
The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle Methodology Article
Zellers, Rowan G
Drewell, Robert A
Dresch, Jacqueline M
MARZ: an algorithm to combinatorially analyze gapped n-mer models of transcription factor binding
title MARZ: an algorithm to combinatorially analyze gapped n-mer models of transcription factor binding
title_full MARZ: an algorithm to combinatorially analyze gapped n-mer models of transcription factor binding
title_fullStr MARZ: an algorithm to combinatorially analyze gapped n-mer models of transcription factor binding
title_full_unstemmed MARZ: an algorithm to combinatorially analyze gapped n-mer models of transcription factor binding
title_short MARZ: an algorithm to combinatorially analyze gapped n-mer models of transcription factor binding
title_sort marz: an algorithm to combinatorially analyze gapped n-mer models of transcription factor binding
topic Methodology Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4384306/
https://www.ncbi.nlm.nih.gov/pubmed/25637281
http://dx.doi.org/10.1186/s12859-014-0446-3