An intuitionistic approach to scoring DNA sequences against transcription factor binding site motifs

BACKGROUND: Transcription factors (TFs) control transcription by binding to specific regions of DNA called transcription factor binding sites (TFBSs). The identification of TFBSs is a crucial problem in computational biology and includes the subtask of predicting the location of known TFBS motifs in...

Bibliographic Details
Main Authors: Garcia-Alcalde, Fernando; Blanco, Armando; Shepherd, Adrian J
Format: Text
Language: English
Published: BioMed Central 2010
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3098096/
https://www.ncbi.nlm.nih.gov/pubmed/21059262
http://dx.doi.org/10.1186/1471-2105-11-551
_version_ 1782203916581076992
author Garcia-Alcalde, Fernando
Blanco, Armando
Shepherd, Adrian J
collection PubMed
description BACKGROUND: Transcription factors (TFs) control transcription by binding to specific regions of DNA called transcription factor binding sites (TFBSs). The identification of TFBSs is a crucial problem in computational biology and includes the subtask of predicting the location of known TFBS motifs in a given DNA sequence. It has previously been shown that, when scoring matches to known TFBS motifs, interdependencies between positions within a motif should be taken into account. However, this remains a challenging task owing to the fact that sequences similar to those of known TFBSs can occur by chance with a relatively high frequency. Here we present a new method for matching sequences to TFBS motifs based on intuitionistic fuzzy sets (IFS) theory, an approach that has been shown to be particularly appropriate for tackling problems that embody a high degree of uncertainty. RESULTS: We propose SC(intuit), a new scoring method for measuring sequence-motif affinity based on IFS theory. Unlike existing methods that consider dependencies between positions, SC(intuit) is designed to prevent overestimation of less conserved positions of TFBSs. For a given pair of bases, SC(intuit) is computed not only as a function of their combined probability of occurrence, but also taking into account the individual importance of each single base at its corresponding position. We used SC(intuit) to identify known TFBSs in DNA sequences. Our method provides excellent results when dealing with both synthetic and real data, outperforming the sensitivity and the specificity of two existing methods in all the experiments we performed. CONCLUSIONS: The results show that SC(intuit) improves the prediction quality for TFs of the existing approaches without compromising sensitivity. In addition, we show how SC(intuit) can be successfully applied to real research problems. In this study the reliability of the IFS theory for motif discovery tasks is proven.
format Text
id pubmed-3098096
institution National Center for Biotechnology Information
language English
publishDate 2010
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3098096 2011-07-08 An intuitionistic approach to scoring DNA sequences against transcription factor binding site motifs. Garcia-Alcalde, Fernando; Blanco, Armando; Shepherd, Adrian J. BMC Bioinformatics. Research Article. BioMed Central 2010-11-08 /pmc/articles/PMC3098096/ /pubmed/21059262 http://dx.doi.org/10.1186/1471-2105-11-551 Text en Copyright ©2010 Garcia-Alcalde et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title An intuitionistic approach to scoring DNA sequences against transcription factor binding site motifs
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3098096/
https://www.ncbi.nlm.nih.gov/pubmed/21059262
http://dx.doi.org/10.1186/1471-2105-11-551