A particle swarm optimization-based algorithm for finding gapped motifs

BACKGROUND: Identifying approximately repeated patterns, or motifs, in DNA sequences from a set of co-regulated genes is an important step towards deciphering the complex gene regulatory networks and understanding gene functions. RESULTS: In this work, we develop a novel motif finding algorithm (PSO...

Bibliographic Details
Main authors: Lei, Chengwei; Ruan, Jianhua
Format: Text
Language: English
Published: BioMed Central 2010
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3022572/
https://www.ncbi.nlm.nih.gov/pubmed/21144057
http://dx.doi.org/10.1186/1756-0381-3-9
author Lei, Chengwei
Ruan, Jianhua
collection PubMed
description BACKGROUND: Identifying approximately repeated patterns, or motifs, in DNA sequences from a set of co-regulated genes is an important step towards deciphering complex gene regulatory networks and understanding gene functions. RESULTS: In this work, we develop a novel motif finding algorithm (PSO+) using a population-based stochastic optimization technique called Particle Swarm Optimization (PSO), which has been shown to be effective in optimizing difficult multidimensional problems in continuous domains. We propose a modification of the standard PSO algorithm to handle discrete values, such as characters in DNA sequences. The algorithm provides several features. First, we use both consensus and position-specific weight matrix representations in our algorithm, taking advantage of the efficiency of the former and the accuracy of the latter. Furthermore, many real motifs contain gaps, but existing methods usually ignore them or assume that the user knows their exact locations and lengths, which is usually impractical for real applications. In comparison, our method models gaps explicitly and provides an easy solution for finding gapped motifs without any detailed knowledge of the gaps. Our method also allows input sequences that contain zero or multiple binding sites. CONCLUSION: Experimental results on synthetic challenge problems as well as real biological sequences show that our method is both more efficient and more accurate than several existing algorithms, especially when gaps are present in the motifs.
format Text
id pubmed-3022572
institution National Center for Biotechnology Information
language English
publishDate 2010
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3022572 2011-01-21 A particle swarm optimization-based algorithm for finding gapped motifs Lei, Chengwei Ruan, Jianhua BioData Min Research BioMed Central 2010-12-13 /pmc/articles/PMC3022572/ /pubmed/21144057 http://dx.doi.org/10.1186/1756-0381-3-9 Text en Copyright ©2010 Lei and Ruan; licensee BioMed Central Ltd.
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title A particle swarm optimization-based algorithm for finding gapped motifs
topic Research