
CisMiner: Genome-Wide In-Silico Cis-Regulatory Module Prediction by Fuzzy Itemset Mining

Eukaryotic gene control regions are known to be spread throughout non-coding DNA sequences which may appear distant from the gene promoter. Transcription factors are proteins that coordinately bind to these regions at transcription factor binding sites to regulate gene expression. Several tools allo...


Bibliographic Details
Main Authors: Navarro, Carmen, Lopez, Francisco J., Cano, Carlos, Garcia-Alcalde, Fernando, Blanco, Armando
Format: Online Article Text
Language: English
Published: Public Library of Science 2014
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4182448/
https://www.ncbi.nlm.nih.gov/pubmed/25268582
http://dx.doi.org/10.1371/journal.pone.0108065
_version_ 1782337529262899200
author Navarro, Carmen
Lopez, Francisco J.
Cano, Carlos
Garcia-Alcalde, Fernando
Blanco, Armando
author_facet Navarro, Carmen
Lopez, Francisco J.
Cano, Carlos
Garcia-Alcalde, Fernando
Blanco, Armando
author_sort Navarro, Carmen
collection PubMed
description Eukaryotic gene control regions are known to be spread throughout non-coding DNA sequences and may lie far from the gene promoter. Transcription factors are proteins that bind coordinately to these regions at transcription factor binding sites to regulate gene expression. Several tools can detect significant co-occurrences of closely located binding sites (cis-regulatory modules, CRMs). However, each of these tools has at least one of the following limitations: 1) its scope is limited to promoter or conserved regions of the genome; 2) it cannot identify combinations involving more than two motifs; 3) it requires prior information about target motifs. In this work we present CisMiner, a novel methodology for detecting putative CRMs by means of a fuzzy itemset mining approach that can operate at genome-wide scale. CisMiner performs a blind search for CRMs without any prior information about target CRMs and without any limit on the number of motifs. CisMiner tackles the combinatorial complexity of genome-wide cis-regulatory module extraction by representing motif combinations naturally as itemsets and applying the Top-Down Fuzzy Frequent-Pattern Tree algorithm to identify significant itemsets. Fuzzy technology allows CisMiner to better handle the imprecision and noise inherent in regulatory processes. Results obtained for a set of well-known binding sites in the S. cerevisiae genome show that our method yields highly reliable predictions. Furthermore, CisMiner was also applied to putative in-silico predicted transcription factor binding sites to identify significant combinations in S. cerevisiae and D. melanogaster, showing that our approach can be applied genome-wide to more complex genomes. CisMiner is freely accessible at: http://genome2.ugr.es/cisminer. CisMiner can be queried for the results presented in this work and can also perform a customized cis-regulatory module prediction on a query set of transcription factor binding sites provided by the user.
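The itemset representation mentioned in the description can be illustrated with a small sketch. This is not CisMiner's implementation: it enumerates motif combinations by brute force rather than using the Top-Down Fuzzy Frequent-Pattern Tree algorithm, and it assumes that each genomic window is a dictionary mapping a motif name to a membership degree in [0, 1], that the minimum is used to combine memberships, and that the fuzzy support of an itemset is the mean of that combined value over all windows. The names `fuzzy_support`, `frequent_itemsets`, `windows`, and the motif labels are hypothetical, and the data is a made-up toy example.

```python
from itertools import combinations


def fuzzy_support(windows, itemset):
    """Mean, over all windows, of the minimum membership among the motifs in itemset.

    A motif that is absent from a window contributes a membership of 0.0.
    """
    total = 0.0
    for window in windows:
        total += min(window.get(motif, 0.0) for motif in itemset)
    return total / len(windows)


def frequent_itemsets(windows, min_support, max_size=3):
    """Brute-force search for motif combinations whose fuzzy support reaches min_support."""
    motifs = sorted({motif for window in windows for motif in window})
    result = {}
    for size in range(1, max_size + 1):
        for itemset in combinations(motifs, size):
            support = fuzzy_support(windows, itemset)
            if support >= min_support:
                result[itemset] = support
    return result


# Hypothetical toy data: four windows, three motifs, invented membership degrees.
windows = [
    {"motif_A": 0.9, "motif_B": 0.8},
    {"motif_A": 0.7, "motif_B": 0.6, "motif_C": 0.2},
    {"motif_C": 0.9},
    {"motif_A": 0.5, "motif_C": 0.4},
]

candidates = frequent_itemsets(windows, min_support=0.3)
for itemset, support in sorted(candidates.items()):
    print(itemset, round(support, 3))
```

With a threshold of 0.3, the pair (motif_A, motif_B) is retained alongside the three single motifs, because the two motifs occur together with high membership in two of the four windows. The brute-force enumeration grows exponentially with the number of motifs, which is the combinatorial cost that a frequent-pattern tree algorithm is meant to avoid.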
format Online
Article
Text
id pubmed-4182448
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-4182448 2014-10-07 CisMiner: Genome-Wide In-Silico Cis-Regulatory Module Prediction by Fuzzy Itemset Mining Navarro, Carmen Lopez, Francisco J. Cano, Carlos Garcia-Alcalde, Fernando Blanco, Armando PLoS One Research Article Eukaryotic gene control regions are known to be spread throughout non-coding DNA sequences and may lie far from the gene promoter. Transcription factors are proteins that bind coordinately to these regions at transcription factor binding sites to regulate gene expression. Several tools can detect significant co-occurrences of closely located binding sites (cis-regulatory modules, CRMs). However, each of these tools has at least one of the following limitations: 1) its scope is limited to promoter or conserved regions of the genome; 2) it cannot identify combinations involving more than two motifs; 3) it requires prior information about target motifs. In this work we present CisMiner, a novel methodology for detecting putative CRMs by means of a fuzzy itemset mining approach that can operate at genome-wide scale. CisMiner performs a blind search for CRMs without any prior information about target CRMs and without any limit on the number of motifs. CisMiner tackles the combinatorial complexity of genome-wide cis-regulatory module extraction by representing motif combinations naturally as itemsets and applying the Top-Down Fuzzy Frequent-Pattern Tree algorithm to identify significant itemsets. Fuzzy technology allows CisMiner to better handle the imprecision and noise inherent in regulatory processes. Results obtained for a set of well-known binding sites in the S. cerevisiae genome show that our method yields highly reliable predictions. Furthermore, CisMiner was also applied to putative in-silico predicted transcription factor binding sites to identify significant combinations in S. cerevisiae and D. melanogaster, showing that our approach can be applied genome-wide to more complex genomes. CisMiner is freely accessible at: http://genome2.ugr.es/cisminer. CisMiner can be queried for the results presented in this work and can also perform a customized cis-regulatory module prediction on a query set of transcription factor binding sites provided by the user. Public Library of Science 2014-09-30 /pmc/articles/PMC4182448/ /pubmed/25268582 http://dx.doi.org/10.1371/journal.pone.0108065 Text en © 2014 Navarro et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
spellingShingle Research Article
Navarro, Carmen
Lopez, Francisco J.
Cano, Carlos
Garcia-Alcalde, Fernando
Blanco, Armando
CisMiner: Genome-Wide In-Silico Cis-Regulatory Module Prediction by Fuzzy Itemset Mining
title CisMiner: Genome-Wide In-Silico Cis-Regulatory Module Prediction by Fuzzy Itemset Mining
title_full CisMiner: Genome-Wide In-Silico Cis-Regulatory Module Prediction by Fuzzy Itemset Mining
title_fullStr CisMiner: Genome-Wide In-Silico Cis-Regulatory Module Prediction by Fuzzy Itemset Mining
title_full_unstemmed CisMiner: Genome-Wide In-Silico Cis-Regulatory Module Prediction by Fuzzy Itemset Mining
title_short CisMiner: Genome-Wide In-Silico Cis-Regulatory Module Prediction by Fuzzy Itemset Mining
title_sort cisminer: genome-wide in-silico cis-regulatory module prediction by fuzzy itemset mining
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4182448/
https://www.ncbi.nlm.nih.gov/pubmed/25268582
http://dx.doi.org/10.1371/journal.pone.0108065