Discovery of Regulatory Elements is Improved by a Discriminatory Approach

A major goal in post-genome biology is the complete mapping of the gene regulatory networks for every organism. Identification of regulatory elements is a prerequisite for realizing this ambitious goal. A common problem is finding regulatory patterns in promoters of a group of co-expressed genes, bu...

Bibliographic Details
Main Authors: Valen, Eivind; Sandelin, Albin; Winther, Ole; Krogh, Anders
Format: Text
Language: English
Published: Public Library of Science 2009
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2770120/
https://www.ncbi.nlm.nih.gov/pubmed/19911049
http://dx.doi.org/10.1371/journal.pcbi.1000562
author Valen, Eivind
Sandelin, Albin
Winther, Ole
Krogh, Anders
collection PubMed
description A major goal in post-genome biology is the complete mapping of the gene regulatory networks for every organism. Identification of regulatory elements is a prerequisite for realizing this ambitious goal. A common problem is finding regulatory patterns in promoters of a group of co-expressed genes, but contemporary methods are challenged by the size and diversity of regulatory regions in higher metazoans. Two key issues are the small amount of information contained in a pattern compared to the large promoter regions and the repetitive characteristics of genomic DNA, which both lead to “pattern drowning”. We present a new computational method for identifying transcription factor binding sites in promoters using a discriminatory approach with a large negative set encompassing a significant sample of the promoters from the relevant genome. The sequences are described by a probabilistic model, and the most discriminatory motifs are identified by maximizing the probability of the sets given the motif model and prior probabilities of motif occurrences in both sets. Due to the large number of promoters in the negative set, an enhanced suffix array is used to improve speed and performance. Using our method, we demonstrate higher accuracy than the best of contemporary methods, high robustness when extending the length of the input sequences, and a strong correlation between our objective function and the correct solution. Using a large background set of real promoters instead of a simplified model leads to higher discriminatory power and markedly reduces the need for repeat masking, a common pre-processing step for other pattern finders.
format Text
id pubmed-2770120
institution National Center for Biotechnology Information
language English
publishDate 2009
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-2770120 2009-11-13 Discovery of Regulatory Elements is Improved by a Discriminatory Approach Valen, Eivind Sandelin, Albin Winther, Ole Krogh, Anders PLoS Comput Biol Research Article Public Library of Science 2009-11-13 /pmc/articles/PMC2770120/ /pubmed/19911049 http://dx.doi.org/10.1371/journal.pcbi.1000562 Text en Valen et al.
http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title Discovery of Regulatory Elements is Improved by a Discriminatory Approach
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2770120/
https://www.ncbi.nlm.nih.gov/pubmed/19911049
http://dx.doi.org/10.1371/journal.pcbi.1000562