Cargando…
CisMiner: Genome-Wide In-Silico Cis-Regulatory Module Prediction by Fuzzy Itemset Mining
Eukaryotic gene control regions are known to be spread throughout non-coding DNA sequences which may appear distant from the gene promoter. Transcription factors are proteins that coordinately bind to these regions at transcription factor binding sites to regulate gene expression. Several tools allo...
Autores principales: | , , , , |
---|---|
Formato: | Online Artículo Texto |
Lenguaje: | English |
Publicado: |
Public Library of Science
2014
|
Materias: | |
Acceso en línea: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4182448/ https://www.ncbi.nlm.nih.gov/pubmed/25268582 http://dx.doi.org/10.1371/journal.pone.0108065 |
_version_ | 1782337529262899200 |
---|---|
author | Navarro, Carmen Lopez, Francisco J. Cano, Carlos Garcia-Alcalde, Fernando Blanco, Armando |
author_facet | Navarro, Carmen Lopez, Francisco J. Cano, Carlos Garcia-Alcalde, Fernando Blanco, Armando |
author_sort | Navarro, Carmen |
collection | PubMed |
description | Eukaryotic gene control regions are known to be spread throughout non-coding DNA sequences which may appear distant from the gene promoter. Transcription factors are proteins that coordinately bind to these regions at transcription factor binding sites to regulate gene expression. Several tools allow to detect significant co-occurrences of closely located binding sites (cis-regulatory modules, CRMs). However, these tools present at least one of the following limitations: 1) scope limited to promoter or conserved regions of the genome; 2) do not allow to identify combinations involving more than two motifs; 3) require prior information about target motifs. In this work we present CisMiner, a novel methodology to detect putative CRMs by means of a fuzzy itemset mining approach able to operate at genome-wide scale. CisMiner allows to perform a blind search of CRMs without any prior information about target CRMs nor limitation in the number of motifs. CisMiner tackles the combinatorial complexity of genome-wide cis-regulatory module extraction using a natural representation of motif combinations as itemsets and applying the Top-Down Fuzzy Frequent- Pattern Tree algorithm to identify significant itemsets. Fuzzy technology allows CisMiner to better handle the imprecision and noise inherent to regulatory processes. Results obtained for a set of well-known binding sites in the S. cerevisiae genome show that our method yields highly reliable predictions. Furthermore, CisMiner was also applied to putative in-silico predicted transcription factor binding sites to identify significant combinations in S. cerevisiae and D. melanogaster, proving that our approach can be further applied genome-wide to more complex genomes. CisMiner is freely accesible at: http://genome2.ugr.es/cisminer. CisMiner can be queried for the results presented in this work and can also perform a customized cis-regulatory module prediction on a query set of transcription factor binding sites provided by the user. |
format | Online Article Text |
id | pubmed-4182448 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2014 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-41824482014-10-07 CisMiner: Genome-Wide In-Silico Cis-Regulatory Module Prediction by Fuzzy Itemset Mining Navarro, Carmen Lopez, Francisco J. Cano, Carlos Garcia-Alcalde, Fernando Blanco, Armando PLoS One Research Article Eukaryotic gene control regions are known to be spread throughout non-coding DNA sequences which may appear distant from the gene promoter. Transcription factors are proteins that coordinately bind to these regions at transcription factor binding sites to regulate gene expression. Several tools allow to detect significant co-occurrences of closely located binding sites (cis-regulatory modules, CRMs). However, these tools present at least one of the following limitations: 1) scope limited to promoter or conserved regions of the genome; 2) do not allow to identify combinations involving more than two motifs; 3) require prior information about target motifs. In this work we present CisMiner, a novel methodology to detect putative CRMs by means of a fuzzy itemset mining approach able to operate at genome-wide scale. CisMiner allows to perform a blind search of CRMs without any prior information about target CRMs nor limitation in the number of motifs. CisMiner tackles the combinatorial complexity of genome-wide cis-regulatory module extraction using a natural representation of motif combinations as itemsets and applying the Top-Down Fuzzy Frequent- Pattern Tree algorithm to identify significant itemsets. Fuzzy technology allows CisMiner to better handle the imprecision and noise inherent to regulatory processes. Results obtained for a set of well-known binding sites in the S. cerevisiae genome show that our method yields highly reliable predictions. Furthermore, CisMiner was also applied to putative in-silico predicted transcription factor binding sites to identify significant combinations in S. cerevisiae and D. melanogaster, proving that our approach can be further applied genome-wide to more complex genomes. CisMiner is freely accesible at: http://genome2.ugr.es/cisminer. CisMiner can be queried for the results presented in this work and can also perform a customized cis-regulatory module prediction on a query set of transcription factor binding sites provided by the user. Public Library of Science 2014-09-30 /pmc/articles/PMC4182448/ /pubmed/25268582 http://dx.doi.org/10.1371/journal.pone.0108065 Text en © 2014 Navarro et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited. |
spellingShingle | Research Article Navarro, Carmen Lopez, Francisco J. Cano, Carlos Garcia-Alcalde, Fernando Blanco, Armando CisMiner: Genome-Wide In-Silico Cis-Regulatory Module Prediction by Fuzzy Itemset Mining |
title | CisMiner: Genome-Wide In-Silico Cis-Regulatory Module Prediction by Fuzzy Itemset Mining |
title_full | CisMiner: Genome-Wide In-Silico Cis-Regulatory Module Prediction by Fuzzy Itemset Mining |
title_fullStr | CisMiner: Genome-Wide In-Silico Cis-Regulatory Module Prediction by Fuzzy Itemset Mining |
title_full_unstemmed | CisMiner: Genome-Wide In-Silico Cis-Regulatory Module Prediction by Fuzzy Itemset Mining |
title_short | CisMiner: Genome-Wide In-Silico Cis-Regulatory Module Prediction by Fuzzy Itemset Mining |
title_sort | cisminer: genome-wide in-silico cis-regulatory module prediction by fuzzy itemset mining |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4182448/ https://www.ncbi.nlm.nih.gov/pubmed/25268582 http://dx.doi.org/10.1371/journal.pone.0108065 |
work_keys_str_mv | AT navarrocarmen cisminergenomewideinsilicocisregulatorymodulepredictionbyfuzzyitemsetmining AT lopezfranciscoj cisminergenomewideinsilicocisregulatorymodulepredictionbyfuzzyitemsetmining AT canocarlos cisminergenomewideinsilicocisregulatorymodulepredictionbyfuzzyitemsetmining AT garciaalcaldefernando cisminergenomewideinsilicocisregulatorymodulepredictionbyfuzzyitemsetmining AT blancoarmando cisminergenomewideinsilicocisregulatorymodulepredictionbyfuzzyitemsetmining |