
Increasing Coverage of Transcription Factor Position Weight Matrices through Domain-level Homology

Transcription factor-DNA interactions, central to cellular regulation and control, are commonly described by position weight matrices (PWMs). These matrices are frequently used to predict transcription factor binding sites in regulatory regions of DNA to complement and guide further experimental inv...


Bibliographic Details
Main Authors: Bernard, Brady; Thorsson, Vesteinn; Rovira, Hector; Shmulevich, Ilya
Format: Online Article Text
Language: English
Published: Public Library of Science 2012
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3428306/
https://www.ncbi.nlm.nih.gov/pubmed/22952610
http://dx.doi.org/10.1371/journal.pone.0042779
_version_ 1782241681855217664
author Bernard, Brady
Thorsson, Vesteinn
Rovira, Hector
Shmulevich, Ilya
author_facet Bernard, Brady
Thorsson, Vesteinn
Rovira, Hector
Shmulevich, Ilya
author_sort Bernard, Brady
collection PubMed
description Transcription factor-DNA interactions, central to cellular regulation and control, are commonly described by position weight matrices (PWMs). These matrices are frequently used to predict transcription factor binding sites in regulatory regions of DNA to complement and guide further experimental investigation. The DNA sequence preferences of transcription factors, encoded in PWMs, are dictated primarily by select residues within the DNA binding domain(s) that interact directly with DNA. Therefore, the DNA binding properties of homologous transcription factors with identical DNA binding domains may be characterized by PWMs derived from different species. Accordingly, we have implemented a fully automated domain-level homology searching method for identical DNA binding sequences. By applying the domain-level homology search to transcription factors with existing PWMs in the JASPAR and TRANSFAC databases, we were able to significantly increase coverage in terms of the total number of PWMs associated with a given species, assign PWMs to transcription factors that did not previously have any associations, and increase the number of represented species with PWMs over an order of magnitude. Additionally, using protein binding microarray (PBM) data, we have validated the domain-level method by demonstrating that transcription factor pairs with matching DNA binding domains exhibit comparable DNA binding specificity predictions to transcription factor pairs with completely identical sequences. The increased coverage achieved herein demonstrates the potential for more thorough species-associated investigation of protein-DNA interactions using existing resources. The PWM scanning results highlight the challenging nature of transcription factors that contain multiple DNA binding domains, as well as the impact of motif discovery on the ability to predict DNA binding properties. 
The method is additionally suitable for identifying domain-level homology mappings to enable utilization of additional information sources in the study of transcription factors. The domain-level homology search method, resulting PWM mappings, web-based user interface, and web API are publicly available at http://dodoma.systemsbiology.net.
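The core idea in the description, that a PWM can be transferred between transcription factors whose DNA binding domains are identical, can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the function name `transfer_pwms`, the identifiers, and the toy domain sequences are all hypothetical, and each transcription factor is assumed to have a single DNA binding domain (the description notes that factors with multiple domains are harder to handle).

```python
from collections import defaultdict


def transfer_pwms(dbd_by_tf, pwm_by_tf):
    """Infer PWM associations from exact DNA binding domain identity.

    dbd_by_tf: maps a transcription factor id to its DNA binding domain
               sequence (hypothetical single-domain simplification).
    pwm_by_tf: maps a transcription factor id to PWM ids it already has.
    Returns a dict of newly inferred PWM ids for factors without any PWM.
    """
    # Collect the PWM ids known for each distinct domain sequence.
    pwms_by_domain = defaultdict(set)
    for tf, pwm_ids in pwm_by_tf.items():
        domain = dbd_by_tf.get(tf)
        if domain is not None:
            pwms_by_domain[domain].update(pwm_ids)

    # A factor with no PWM inherits those of any factor sharing its domain.
    inferred = {}
    for tf, domain in dbd_by_tf.items():
        if tf in pwm_by_tf:
            continue
        if domain in pwms_by_domain:
            inferred[tf] = sorted(pwms_by_domain[domain])
    return inferred


# Toy data: all identifiers and sequences below are made up for illustration.
dbd_by_tf = {
    "human_TF_A": "RKRRNAAS",
    "mouse_TF_A": "RKRRNAAS",  # same domain as human_TF_A
    "fly_TF_B": "KKRRNSAS",    # different domain, no PWM to inherit
}
pwm_by_tf = {"human_TF_A": ["PWM_0001"]}

inferred = transfer_pwms(dbd_by_tf, pwm_by_tf)
print(inferred)
```

Requiring exact sequence identity keeps the sketch conservative: a factor receives a PWM only when its domain matches one that already has a PWM, which mirrors the "identical DNA binding sequences" criterion stated in the description.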
format Online
Article
Text
id pubmed-3428306
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-3428306 2012-09-05 Increasing Coverage of Transcription Factor Position Weight Matrices through Domain-level Homology Bernard, Brady Thorsson, Vesteinn Rovira, Hector Shmulevich, Ilya PLoS One Research Article Public Library of Science 2012-08-27 /pmc/articles/PMC3428306/ /pubmed/22952610 http://dx.doi.org/10.1371/journal.pone.0042779 Text en © 2012 Bernard et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
spellingShingle Research Article
Bernard, Brady
Thorsson, Vesteinn
Rovira, Hector
Shmulevich, Ilya
Increasing Coverage of Transcription Factor Position Weight Matrices through Domain-level Homology
title Increasing Coverage of Transcription Factor Position Weight Matrices through Domain-level Homology
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3428306/
https://www.ncbi.nlm.nih.gov/pubmed/22952610
http://dx.doi.org/10.1371/journal.pone.0042779