FISim: A new similarity measure between transcription factor binding sites based on the fuzzy integral
BACKGROUND: Regulatory motifs describe sets of related transcription factor binding sites (TFBSs) and can be represented as position frequency matrices (PFMs). De novo identification of TFBSs is a crucial problem in computational biology which includes the issue of comparing putative motifs with one...
Main authors: | Garcia, Fernando; Lopez, Francisco J; Cano, Carlos; Blanco, Armando |
---|---|
Format: | Text |
Language: | English |
Published: | BioMed Central 2009 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2722654/ https://www.ncbi.nlm.nih.gov/pubmed/19615102 http://dx.doi.org/10.1186/1471-2105-10-224 |
_version_ | 1782170321862787072 |
---|---|
author | Garcia, Fernando Lopez, Francisco J Cano, Carlos Blanco, Armando |
author_facet | Garcia, Fernando Lopez, Francisco J Cano, Carlos Blanco, Armando |
author_sort | Garcia, Fernando |
collection | PubMed |
description | BACKGROUND: Regulatory motifs describe sets of related transcription factor binding sites (TFBSs) and can be represented as position frequency matrices (PFMs). De novo identification of TFBSs is a crucial problem in computational biology which includes the issue of comparing putative motifs with one another and with motifs that are already known. The relative importance of each nucleotide within a given position in the PFMs should be considered in order to compute PFM similarities. Furthermore, biological data are inherently noisy and imprecise. Fuzzy set theory is particularly suitable for modeling imprecise data, whereas fuzzy integrals are highly appropriate for representing the interaction among different information sources. RESULTS: We propose FISim, a new similarity measure between PFMs, based on the fuzzy integral of the distance of the nucleotides with respect to the information content of the positions. Unlike existing methods, FISim is designed to consider the higher contribution of better conserved positions to the binding affinity. FISim provides excellent results when dealing with sets of randomly generated motifs, and outperforms the remaining methods when handling real datasets of related motifs. Furthermore, we propose a new cluster methodology based on kernel theory together with FISim to obtain groups of related motifs potentially bound by the same TFs, providing more robust results than existing approaches. CONCLUSION: FISim corrects a design flaw of the most popular methods, whose measures favour similarity of low information content positions. We use our measure to successfully identify motifs that describe binding sites for the same TF and to solve real-life problems. In this study the reliability of fuzzy technology for motif comparison tasks is proven. |
format | Text |
id | pubmed-2722654 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2009 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-27226542009-08-07 FISim: A new similarity measure between transcription factor binding sites based on the fuzzy integral Garcia, Fernando Lopez, Francisco J Cano, Carlos Blanco, Armando BMC Bioinformatics Research Article BACKGROUND: Regulatory motifs describe sets of related transcription factor binding sites (TFBSs) and can be represented as position frequency matrices (PFMs). De novo identification of TFBSs is a crucial problem in computational biology which includes the issue of comparing putative motifs with one another and with motifs that are already known. The relative importance of each nucleotide within a given position in the PFMs should be considered in order to compute PFM similarities. Furthermore, biological data are inherently noisy and imprecise. Fuzzy set theory is particularly suitable for modeling imprecise data, whereas fuzzy integrals are highly appropriate for representing the interaction among different information sources. RESULTS: We propose FISim, a new similarity measure between PFMs, based on the fuzzy integral of the distance of the nucleotides with respect to the information content of the positions. Unlike existing methods, FISim is designed to consider the higher contribution of better conserved positions to the binding affinity. FISim provides excellent results when dealing with sets of randomly generated motifs, and outperforms the remaining methods when handling real datasets of related motifs. Furthermore, we propose a new cluster methodology based on kernel theory together with FISim to obtain groups of related motifs potentially bound by the same TFs, providing more robust results than existing approaches. CONCLUSION: FISim corrects a design flaw of the most popular methods, whose measures favour similarity of low information content positions. We use our measure to successfully identify motifs that describe binding sites for the same TF and to solve real-life problems. 
In this study the reliability of fuzzy technology for motif comparison tasks is proven. BioMed Central 2009-07-20 /pmc/articles/PMC2722654/ /pubmed/19615102 http://dx.doi.org/10.1186/1471-2105-10-224 Text en Copyright © 2009 Garcia et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Research Article Garcia, Fernando Lopez, Francisco J Cano, Carlos Blanco, Armando FISim: A new similarity measure between transcription factor binding sites based on the fuzzy integral |
title | FISim: A new similarity measure between transcription factor binding sites based on the fuzzy integral |
title_full | FISim: A new similarity measure between transcription factor binding sites based on the fuzzy integral |
title_fullStr | FISim: A new similarity measure between transcription factor binding sites based on the fuzzy integral |
title_full_unstemmed | FISim: A new similarity measure between transcription factor binding sites based on the fuzzy integral |
title_short | FISim: A new similarity measure between transcription factor binding sites based on the fuzzy integral |
title_sort | fisim: a new similarity measure between transcription factor binding sites based on the fuzzy integral |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2722654/ https://www.ncbi.nlm.nih.gov/pubmed/19615102 http://dx.doi.org/10.1186/1471-2105-10-224 |
work_keys_str_mv | AT garciafernando fisimanewsimilaritymeasurebetweentranscriptionfactorbindingsitesbasedonthefuzzyintegral AT lopezfranciscoj fisimanewsimilaritymeasurebetweentranscriptionfactorbindingsitesbasedonthefuzzyintegral AT canocarlos fisimanewsimilaritymeasurebetweentranscriptionfactorbindingsitesbasedonthefuzzyintegral AT blancoarmando fisimanewsimilaritymeasurebetweentranscriptionfactorbindingsitesbasedonthefuzzyintegral |
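
The description above refers to the information content of the positions of a PFM, which is what lets better conserved positions count for more. The following is a minimal sketch of that one quantity only, not of FISim or its fuzzy integral. It assumes a uniform background over the four nucleotides and the standard Shannon-based definition; the function names and the example matrix are hypothetical and do not come from the article.

```python
import math


def column_information_content(counts):
    """Information content, in bits, of one PFM position.

    `counts` holds the frequencies of A, C, G and T at that position.
    Assumption: uniform background, so the maximum is log2(4) = 2 bits.
    """
    total = sum(counts)
    if total <= 0:
        raise ValueError("a PFM position needs at least one observation")
    ic = 2.0  # log2(4), the value for a perfectly conserved position
    for count in counts:
        p = count / total
        if p > 0:  # the term 0 * log2(0) is taken as 0
            ic += p * math.log2(p)
    return ic


def pfm_information_content(pfm):
    """Information content of every position in a PFM (list of count columns)."""
    return [column_information_content(column) for column in pfm]


# Hypothetical three-position motif: fully conserved, two-way split, uniform.
example_pfm = [
    [10, 0, 0, 0],
    [5, 5, 0, 0],
    [5, 5, 5, 5],
]
information = pfm_information_content(example_pfm)
print(information)
```

In this example the conserved first position carries the most information and the uniform last position carries none, which is the property a conservation-aware similarity measure would weight by.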