Computational annotation of UTR cis-regulatory modules through Frequent Pattern Mining

BACKGROUND: Many studies report the detection and functional characterization of cis-regulatory motifs in untranslated regions (UTRs) of mRNAs, but little is known about the nature and functional role of their distribution. To address this issue, we have developed a computational approach based on t...

Bibliographic Details
Main Authors: Turi, Antonio, Loglisci, Corrado, Salvemini, Eliana, Grillo, Giorgio, Malerba, Donato, D'Elia, Domenica
Format: Text
Language: English
Published: BioMed Central 2009
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2697649/
https://www.ncbi.nlm.nih.gov/pubmed/19534751
http://dx.doi.org/10.1186/1471-2105-10-S6-S25
_version_ 1782168348276031488
author Turi, Antonio
Loglisci, Corrado
Salvemini, Eliana
Grillo, Giorgio
Malerba, Donato
D'Elia, Domenica
author_sort Turi, Antonio
collection PubMed
description BACKGROUND: Many studies report the detection and functional characterization of cis-regulatory motifs in untranslated regions (UTRs) of mRNAs, but little is known about the nature and functional role of their distribution. To address this issue, we have developed a computational approach based on data mining techniques. The idea is to mine frequent combinations of translation regulatory motifs, since their significant co-occurrences could reveal functional relationships important for the post-transcriptional control of gene expression. The experiments focused on transcripts targeted to mitochondria, to elucidate the role of translational control in mitochondrial biogenesis and function. RESULTS: The analysis is based on a two-step procedure using a sequential pattern mining algorithm. The first step searches for frequent patterns (FPs) of motifs without taking into account their spatial displacement. In the second step, frequent sequential patterns (FSPs) of spaced motifs are generated by taking into account the conservation of spacers between each ordered pair of co-occurring motifs. The algorithm makes no assumption about the relation among motifs or about the number of motifs involved in a pattern. Different FSPs can be found depending on the combination of two parameters: the minimum percentage of sequences supporting the pattern (the support threshold) and the granularity of spacer discretization. Results can be retrieved at the UTRminer web site. A total of 216 FPs of motifs were discovered in the overall dataset and 140 in the human subset. For each FP, the system provides information on the discovered FSPs, if any. A variety of search options help users browse the web resource. The list of sequence IDs supporting each pattern can be used to retrieve information from the UTRminer database. CONCLUSION: Computational prediction of structural properties of regulatory sequences is not trivial. The presented data mining approach overcomes some limitations observed in other competing tools. Preliminary results on UTR sequences from nuclear transcripts targeting mitochondria are promising and give us confidence in the effectiveness of the approach for future developments.
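The two-step procedure summarized in the description can be illustrated with a minimal Python sketch. This is not the authors' implementation or the UTRminer code: it uses brute-force enumeration rather than a real sequential pattern mining algorithm, and the toy data, the function names (`frequent_motif_sets`, `frequent_spaced_pairs`), the `max_size` cap, and the definition of a spacer as the distance between motif start positions are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy input: each UTR is a list of (motif_label, start_position) hits.
# The labels are placeholders, not results from the paper.
utrs = {
    "utr1": [("IRE", 10), ("TOP", 40), ("uORF", 95)],
    "utr2": [("IRE", 5), ("TOP", 38)],
    "utr3": [("IRE", 20), ("TOP", 52), ("uORF", 60)],
    "utr4": [("uORF", 15)],
}


def frequent_motif_sets(utrs, min_support, max_size=3):
    """Step 1 (sketch): motif combinations found in at least `min_support`
    (a fraction) of the sequences, ignoring motif positions."""
    counts = Counter()
    for hits in utrs.values():
        motifs = sorted({name for name, _ in hits})
        for k in range(1, min(max_size, len(motifs)) + 1):
            counts.update(combinations(motifs, k))
    n = len(utrs)
    return {p: c / n for p, c in counts.items() if c / n >= min_support}


def frequent_spaced_pairs(utrs, min_support, bin_width):
    """Step 2 (sketch): ordered motif pairs whose spacer, discretized into
    bins of `bin_width` positions, is shared by enough sequences."""
    counts = Counter()
    for hits in utrs.values():
        ordered = sorted(hits, key=lambda h: h[1])
        seen = set()  # count each (motif, bin, motif) once per sequence
        for (a, pos_a), (b, pos_b) in combinations(ordered, 2):
            seen.add((a, (pos_b - pos_a) // bin_width, b))
        counts.update(seen)
    n = len(utrs)
    return {p: c / n for p, c in counts.items() if c / n >= min_support}


fps = frequent_motif_sets(utrs, min_support=0.5)
print(fps[("IRE", "TOP")])                                     # 0.75
print(frequent_spaced_pairs(utrs, min_support=0.5, bin_width=10))
print(frequent_spaced_pairs(utrs, min_support=0.5, bin_width=2))
```

On this toy input, a coarse bin width of 10 groups all three IRE/TOP spacers into one bin (support 0.75), while a bin width of 2 splits them so that the best bin has support 0.5. This mirrors the statement that different FSPs emerge from different combinations of support threshold and spacer granularity.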
format Text
id pubmed-2697649
institution National Center for Biotechnology Information
language English
publishDate 2009
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-2697649 2009-06-16 Computational annotation of UTR cis-regulatory modules through Frequent Pattern Mining Turi, Antonio Loglisci, Corrado Salvemini, Eliana Grillo, Giorgio Malerba, Donato D'Elia, Domenica BMC Bioinformatics Proceedings BioMed Central 2009-06-16 /pmc/articles/PMC2697649/ /pubmed/19534751 http://dx.doi.org/10.1186/1471-2105-10-S6-S25 Text en Copyright © 2009 Turi et al; licensee BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Computational annotation of UTR cis-regulatory modules through Frequent Pattern Mining
topic Proceedings
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2697649/
https://www.ncbi.nlm.nih.gov/pubmed/19534751
http://dx.doi.org/10.1186/1471-2105-10-S6-S25