Principal component analysis for predicting transcription-factor binding motifs from array-derived data

BACKGROUND: The responses to interleukin 1 (IL-1) in human chondrocytes constitute a complex regulatory mechanism, in which multiple transcription factors interact combinatorially at transcription-factor binding motifs (TFBMs). In order to select a critical set of TFBMs from genomic DNA information and...

Bibliographic Details
Main authors: Liu, Yunlong; Vincenti, Matthew P; Yokota, Hiroki
Format: Text
Language: English
Published: BioMed Central 2005
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1316881/
https://www.ncbi.nlm.nih.gov/pubmed/16297243
http://dx.doi.org/10.1186/1471-2105-6-276
_version_ 1782126401976008704
author Liu, Yunlong
Vincenti, Matthew P
Yokota, Hiroki
collection PubMed
description BACKGROUND: The responses to interleukin 1 (IL-1) in human chondrocytes constitute a complex regulatory mechanism, in which multiple transcription factors interact combinatorially at transcription-factor binding motifs (TFBMs). Selecting a critical set of TFBMs from genomic DNA information and array-derived data requires an efficient algorithm for solving a combinatorial optimization problem. Although computational approaches based on evolutionary algorithms are commonly employed, an analytical algorithm would be useful for predicting TFBMs at nearly no computational cost and for evaluating varying modelling conditions. Singular value decomposition (SVD) is a powerful method for deriving the primary components of a given matrix. Applying SVD to a promoter matrix defined from regulatory DNA sequences, we derived a novel method to predict the critical set of TFBMs. RESULTS: The promoter matrix was defined to establish a quantitative relationship between the IL-1-driven mRNA alteration and the genomic DNA sequences of the IL-1 responsive genes. The matrix was decomposed with SVD, and the effects of 8 potential TFBMs (5'-CAGGC-3', 5'-CGCCC-3', 5'-CCGCC-3', 5'-ATGGG-3', 5'-GGGAA-3', 5'-CGTCC-3', 5'-AAAGG-3', and 5'-ACCCA-3') were predicted from a pool of 512 random DNA sequences. The prediction included matches to the core binding motifs of biologically known TFBMs such as AP2, SP1, EGR1, KROX, GC-BOX, ABI4, ETF, E2F, SRF, STAT, IK-1, PPARγ, STAF, ROAZ, and NFκB, and their significance was evaluated numerically using Monte Carlo simulation and a genetic algorithm. CONCLUSION: The described SVD-based prediction is an analytical method that provides a set of potential TFBMs involved in transcriptional regulation. The results would be useful for analytically evaluating the contribution of individual DNA sequences.
format Text
id pubmed-1316881
institution National Center for Biotechnology Information
language English
publishDate 2005
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-1316881 2006-01-30 Principal component analysis for predicting transcription-factor binding motifs from array-derived data Liu, Yunlong Vincenti, Matthew P Yokota, Hiroki BMC Bioinformatics Methodology Article BioMed Central 2005-11-18 /pmc/articles/PMC1316881/ /pubmed/16297243 http://dx.doi.org/10.1186/1471-2105-6-276 Text en Copyright © 2005 Liu et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/2.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Principal component analysis for predicting transcription-factor binding motifs from array-derived data
topic Methodology Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1316881/
https://www.ncbi.nlm.nih.gov/pubmed/16297243
http://dx.doi.org/10.1186/1471-2105-6-276
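
Illustrative note on the method summarized in the abstract. The abstract describes building a promoter matrix from regulatory DNA sequences and decomposing it with SVD to estimate the effects of candidate motifs. The following is only a minimal sketch of that general idea, not the authors' formulation: it assumes a linear model z = H x, where H holds motif occurrence counts per promoter, z is a vector of expression changes, and x holds the motif effects. The function names, the tolerance, and the synthetic promoters and expression values are all hypothetical; only the eight motif strings come from the abstract.

```python
import numpy as np

# The eight 5-bp motifs listed in the abstract.
MOTIFS = ["CAGGC", "CGCCC", "CCGCC", "ATGGG", "GGGAA", "CGTCC", "AAAGG", "ACCCA"]


def count_motif(seq, motif):
    """Count occurrences of motif in seq, allowing overlaps."""
    k = len(motif)
    return sum(1 for i in range(len(seq) - k + 1) if seq[i:i + k] == motif)


def promoter_matrix(promoters, motifs):
    """Rows are genes, columns are motifs, entries are occurrence counts."""
    return np.array(
        [[count_motif(p, m) for m in motifs] for p in promoters], dtype=float
    )


def solve_by_svd(H, z, rel_tol=1e-10):
    """Minimum-norm least-squares solution of H x = z via the SVD of H.

    Singular values below rel_tol * (largest singular value) are treated
    as zero; rel_tol is an assumed, illustrative threshold.
    """
    U, s, Vt = np.linalg.svd(H, full_matrices=False)
    s_inv = np.zeros_like(s)
    keep = s > rel_tol * s.max()
    s_inv[keep] = 1.0 / s[keep]
    return Vt.T @ (s_inv * (U.T @ z))


# Synthetic data (assumption): 40 random "promoters" and made-up motif effects.
rng = np.random.default_rng(0)
promoters = ["".join(rng.choice(list("ACGT"), size=500)) for _ in range(40)]
H = promoter_matrix(promoters, MOTIFS)
x_true = rng.normal(size=len(MOTIFS))
z = H @ x_true  # noise-free expression changes under the assumed linear model
x_hat = solve_by_svd(H, z)
```

Because the solution is obtained in closed form from the decomposition, it involves no iterative search, which is consistent with the abstract's contrast between an analytical algorithm and evolutionary approaches. With real data, z would contain measurement noise and the estimate would be a least-squares fit rather than an exact recovery.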