
Finding regulatory DNA motifs using alignment-free evolutionary conservation information

Bibliographic Details
Main authors: Gordân, Raluca, Narlikar, Leelavati, Hartemink, Alexander J.
Format: Text
Language: English
Published: Oxford University Press 2010
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2847231/
https://www.ncbi.nlm.nih.gov/pubmed/20047961
http://dx.doi.org/10.1093/nar/gkp1166
_version_ 1782179554082684928
author Gordân, Raluca
Narlikar, Leelavati
Hartemink, Alexander J.
collection PubMed
description As an increasing number of eukaryotic genomes are being sequenced, comparative studies aimed at detecting regulatory elements in intergenic sequences are becoming more prevalent. Most comparative methods for transcription factor (TF) binding site discovery make use of global or local alignments of orthologous regulatory regions to assess whether a particular DNA site is conserved across related organisms, and thus more likely to be functional. Since binding sites are usually short, sometimes degenerate, and often independent of orientation, alignment algorithms may not align them correctly. Here, we present a novel, alignment-free approach for using conservation information for TF binding site discovery. We relax the definition of conserved sites: we consider a DNA site within a regulatory region to be conserved in an orthologous sequence if it occurs anywhere in that sequence, irrespective of orientation. We use this definition to derive informative priors over DNA sequence positions, and incorporate these priors into a Gibbs sampling algorithm for motif discovery. Our approach is simple and fast. It requires neither sequence alignments nor the phylogenetic relationships between the orthologous sequences, yet it is more effective on real biological data than methods that do.
format Text
id pubmed-2847231
institution National Center for Biotechnology Information
language English
publishDate 2010
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-28472312010-04-01 Finding regulatory DNA motifs using alignment-free evolutionary conservation information Gordân, Raluca Narlikar, Leelavati Hartemink, Alexander J. Nucleic Acids Res Methods Online As an increasing number of eukaryotic genomes are being sequenced, comparative studies aimed at detecting regulatory elements in intergenic sequences are becoming more prevalent. Most comparative methods for transcription factor (TF) binding site discovery make use of global or local alignments of orthologous regulatory regions to assess whether a particular DNA site is conserved across related organisms, and thus more likely to be functional. Since binding sites are usually short, sometimes degenerate, and often independent of orientation, alignment algorithms may not align them correctly. Here, we present a novel, alignment-free approach for using conservation information for TF binding site discovery. We relax the definition of conserved sites: we consider a DNA site within a regulatory region to be conserved in an orthologous sequence if it occurs anywhere in that sequence, irrespective of orientation. We use this definition to derive informative priors over DNA sequence positions, and incorporate these priors into a Gibbs sampling algorithm for motif discovery. Our approach is simple and fast. It requires neither sequence alignments nor the phylogenetic relationships between the orthologous sequences, yet it is more effective on real biological data than methods that do. Oxford University Press 2010-04 2010-01-04 /pmc/articles/PMC2847231/ /pubmed/20047961 http://dx.doi.org/10.1093/nar/gkp1166 Text en © The Author(s) 2010. Published by Oxford University Press. 
http://creativecommons.org/licenses/by-nc/2.5 This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.5), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Finding regulatory DNA motifs using alignment-free evolutionary conservation information
topic Methods Online
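
Illustrative note: the abstract's relaxed definition of conservation (a site counts as conserved in an orthologous sequence if it occurs anywhere in that sequence, in either orientation) can be sketched in a few lines of Python. This is a minimal sketch and not the authors' implementation. The function names, the fixed site width `k`, the pseudocount, and the step that normalizes scores into a positional prior are all assumptions made for illustration; the record does not describe how the published priors are actually derived, and the Gibbs sampler itself is not shown.

```python
from typing import List

# Assumes uppercase A/C/G/T input; other characters pass through unchanged.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(site: str) -> str:
    """Return the reverse complement of a DNA string."""
    return site.translate(_COMPLEMENT)[::-1]


def conservation_scores(reference: str, orthologs: List[str], k: int) -> List[float]:
    """For each length-k site in `reference`, return the fraction of
    orthologous sequences that contain the site anywhere, on either strand.
    No alignment and no phylogenetic tree are used."""
    scores = []
    for start in range(len(reference) - k + 1):
        site = reference[start:start + k]
        rc = reverse_complement(site)
        hits = sum(1 for seq in orthologs if site in seq or rc in seq)
        scores.append(hits / len(orthologs) if orthologs else 0.0)
    return scores


def positional_prior(scores: List[float], pseudocount: float = 0.1) -> List[float]:
    """Hypothetical step: turn conservation scores into a distribution over
    start positions. The pseudocount keeps unconserved positions possible."""
    weights = [s + pseudocount for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    ref = "ACGCCC"
    orthos = ["GGACGG", "TTACGA"]
    scores = conservation_scores(ref, orthos, k=3)
    print(scores)
    print(positional_prior(scores))
```

In a sampler, a prior of this kind would weight candidate start positions before the motif likelihood is applied, so sites that recur in related species are proposed more often.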