A niched Pareto genetic algorithm for finding variable length regulatory motifs in DNA sequences

Transcription factor binding sites, also called motifs, are short, recurring patterns in DNA sequences that are presumed to have a biological function. Identifying motifs in the promoter regions of genes is an important and unsolved problem, particularly in eukaryotic genomes....

Bibliographic Details
Main Authors: Vijayvargiya, Shripal; Shukla, Pratyoosh
Format: Online Article Text
Language: English
Published: Springer Berlin Heidelberg 2011
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3376862/
http://dx.doi.org/10.1007/s13205-011-0040-6
author Vijayvargiya, Shripal
Shukla, Pratyoosh
collection PubMed
description Transcription factor binding sites, also called motifs, are short, recurring patterns in DNA sequences that are presumed to have a biological function. Identifying motifs in the promoter regions of genes is an important and unsolved problem, particularly in eukaryotic genomes. In this paper, we present a niched Pareto genetic algorithm to identify regulatory motifs. The approach maximizes two objectives: the motif length and the consensus similarity score. A longer motif is less likely to be a false motif. The similarity score represents a motif's probability of conservation in a given set of sequences. The proposed method can find multiple, variable-length motifs. In this method, a candidate motif is represented as a combination of the length and the starting position of the motif in each sequence of the co-regulated genes. This enables the algorithm to identify multiple motifs of variable length. We applied this approach to various data sets, and the results show that it can find multiple motifs of variable length in co-regulated genes.
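The representation and the two objectives named in the description can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not give the exact similarity formula, so the column-wise consensus score used here is an assumption, and the names `Candidate`, `similarity`, `dominates`, and `pareto_front` are hypothetical. The niching (fitness sharing) step of a niched Pareto genetic algorithm is left out; only the encoding and the Pareto comparison are shown.

```python
from collections import Counter
from typing import List, NamedTuple, Sequence, Tuple


class Candidate(NamedTuple):
    """A candidate motif: one length plus one start position per sequence."""
    length: int
    starts: Tuple[int, ...]


def instances(c: Candidate, seqs: Sequence[str]) -> List[str]:
    """Cut the motif instance out of each sequence."""
    sites = [s[p:p + c.length] for s, p in zip(seqs, c.starts)]
    if len(sites) != len(seqs) or any(len(site) != c.length for site in sites):
        raise ValueError("candidate does not fit inside every sequence")
    return sites


def similarity(c: Candidate, seqs: Sequence[str]) -> float:
    """Assumed score: mean frequency of the most common base per column."""
    sites = instances(c, seqs)
    total = 0.0
    for column in zip(*sites):
        total += Counter(column).most_common(1)[0][1] / len(sites)
    return total / c.length


def objectives(c: Candidate, seqs: Sequence[str]) -> Tuple[float, float]:
    """Both objectives are maximized: motif length and similarity."""
    return (float(c.length), similarity(c, seqs))


def dominates(a: Tuple[float, float], b: Tuple[float, float]) -> bool:
    """a dominates b if it is no worse on both objectives and better on one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(cands: Sequence[Candidate], seqs: Sequence[str]) -> List[Candidate]:
    """Keep the candidates that no other candidate dominates."""
    scored = [(c, objectives(c, seqs)) for c in cands]
    return [c for c, f in scored if not any(dominates(g, f) for _, g in scored)]


if __name__ == "__main__":
    # Toy sequences, made up for illustration only.
    seqs = ["ACGTACGT", "TTACGTAA", "GGACGTCC"]
    cands = [Candidate(4, (0, 2, 2)), Candidate(6, (0, 2, 2)), Candidate(4, (1, 0, 0))]
    for c in pareto_front(cands, seqs):
        print(c, objectives(c, seqs))
```

Because length and similarity usually pull in opposite directions (a longer window tends to be less conserved), the non-dominated set typically holds several motifs of different lengths, which is consistent with the description's claim that multiple variable-length motifs can be reported.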
format Online
Article
Text
id pubmed-3376862
institution National Center for Biotechnology Information
language English
publishDate 2011
publisher Springer Berlin Heidelberg
record_format MEDLINE/PubMed
spelling pubmed-3376862 2012-09-11 3 Biotech Original Article Springer Berlin Heidelberg 2011-12-09 2012-06 /pmc/articles/PMC3376862/ http://dx.doi.org/10.1007/s13205-011-0040-6 Text en © The Author(s) 2011 https://creativecommons.org/licenses/by/4.0/ This article is published under license to BioMed Central Ltd. Open Access: This article is distributed under the terms of the Creative Commons Attribution License, which permits any use, distribution, and reproduction in any medium, provided the original author(s) and the source are credited.
title A niched Pareto genetic algorithm for finding variable length regulatory motifs in DNA sequences
topic Original Article