Imogene: identification of motifs and cis-regulatory modules underlying gene co-regulation

Cis-regulatory modules (CRMs) and motifs play a central role in tissue and condition-specific gene expression. Here we present Imogene, an ensemble of statistical tools that we have developed to facilitate their identification and implemented in a publicly available software. Starting from a small t...

Bibliographic Details
Main Authors: Rouault, Hervé, Santolini, Marc, Schweisguth, François, Hakim, Vincent
Format: Online Article Text
Language: English
Published: Oxford University Press 2014
Subjects: Computational Biology
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4041412/
https://www.ncbi.nlm.nih.gov/pubmed/24682824
http://dx.doi.org/10.1093/nar/gku209
author Rouault, Hervé
Santolini, Marc
Schweisguth, François
Hakim, Vincent
collection PubMed
description Cis-regulatory modules (CRMs) and motifs play a central role in tissue and condition-specific gene expression. Here we present Imogene, an ensemble of statistical tools that we have developed to facilitate their identification and implemented in a publicly available software. Starting from a small training set of mammalian or fly CRMs that drive similar gene expression profiles, Imogene determines de novo cis-regulatory motifs that underlie this co-expression. It can then predict on a genome-wide scale other CRMs with a regulatory potential similar to the training set. Imogene bypasses the need of large datasets for statistical analyses by making central use of the information provided by the sequenced genomes of multiple species, based on the developed statistical tools and explicit models for transcription factor binding site evolution. We test Imogene on characterized tissue-specific mouse developmental CRMs. Its ability to identify CRMs with the same specificity based on its de novo created motifs is comparable to that of previously evaluated ‘motif-blind’ methods. We further show, both in flies and in mammals, that Imogene de novo generated motifs are sufficient to discriminate CRMs related to different developmental programs. Notably, purely relying on sequence data, Imogene performs as well in this discrimination task as a previously reported learning algorithm based on Chromatin Immunoprecipitation (ChIP) data for multiple transcription factors at multiple developmental stages.
format Online
Article
Text
id pubmed-4041412
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-4041412 2014-06-11 Nucleic Acids Res Computational Biology Oxford University Press 2014-06-01 2014-03-25 /pmc/articles/PMC4041412/ /pubmed/24682824 http://dx.doi.org/10.1093/nar/gku209 Text en © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.
http://creativecommons.org/licenses/by/3.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title Imogene: identification of motifs and cis-regulatory modules underlying gene co-regulation
topic Computational Biology
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4041412/
https://www.ncbi.nlm.nih.gov/pubmed/24682824
http://dx.doi.org/10.1093/nar/gku209