
SOMEA: self-organizing map based extraction algorithm for DNA motif identification with heterogeneous model

BACKGROUND: Discrimination of transcription factor binding sites (TFBS) from background sequences plays a key role in computational motif discovery. Current clustering-based algorithms employ a homogeneous model, which assumes that motifs and background signals can be equivalently...


Bibliographic Details
Main authors: Lee, Nung Kion; Wang, Dianhui
Format: Text
Language: English
Published: BioMed Central 2011
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3044270/
https://www.ncbi.nlm.nih.gov/pubmed/21342545
http://dx.doi.org/10.1186/1471-2105-12-S1-S16
_version_ 1782198707386580992
author Lee, Nung Kion
Wang, Dianhui
collection PubMed
description BACKGROUND: Discrimination of transcription factor binding sites (TFBS) from background sequences plays a key role in computational motif discovery. Current clustering-based algorithms employ a homogeneous model, which assumes that motifs and background signals can be equivalently characterized. This assumption is limiting because the two kinds of sequence signal have distinct properties. RESULTS: This paper develops a Self-Organizing Map (SOM) based clustering algorithm for extracting binding sites in DNA sequences. The framework is based on a novel intra-node soft competitive procedure that aims to achieve maximum discrimination of motifs from background signals in datasets. The intra-node competition uses an adaptive weighting technique on two different signal models to better represent these two classes of signals. Using several real and artificial datasets, we compared the proposed method with several motif discovery tools. Compared to SOMBRERO, a state-of-the-art SOM-based motif discovery tool, our algorithm achieved significant improvements in the average precision rate (about 27%) on the real datasets without compromising sensitivity. Our method also compared favourably against other motif discovery tools. CONCLUSIONS: Motif discovery with a model-based clustering framework should consider using a heterogeneous model to represent the two classes of signals in DNA sequences. Such a heterogeneous model can achieve better signal discrimination than the homogeneous model.
format Text
id pubmed-3044270
institution National Center for Biotechnology Information
language English
publishDate 2011
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3044270 2011-02-25 BMC Bioinformatics Research BioMed Central 2011-02-15 /pmc/articles/PMC3044270/ /pubmed/21342545 http://dx.doi.org/10.1186/1471-2105-12-S1-S16 Text en Copyright ©2011 Lee and Wang; licensee BioMed Central Ltd.
http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title SOMEA: self-organizing map based extraction algorithm for DNA motif identification with heterogeneous model
topic Research