
Integrating microRNA target predictions for the discovery of gene regulatory networks: a semi-supervised ensemble learning approach

BACKGROUND: MicroRNAs (miRNAs) are small non-coding RNAs which play a key role in the post-transcriptional regulation of many genes. Elucidating miRNA-regulated gene networks is crucial for the understanding of mechanisms and functions of miRNAs in many biological processes, such as cell proliferati...


Bibliographic Details
Main Authors: Pio, Gianvito, Malerba, Donato, D'Elia, Domenica, Ceci, Michelangelo
Format: Online Article Text
Language: English
Published: BioMed Central 2014
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4015287/
https://www.ncbi.nlm.nih.gov/pubmed/24564296
http://dx.doi.org/10.1186/1471-2105-15-S1-S4
_version_ 1782315311872081920
author Pio, Gianvito
Malerba, Donato
D'Elia, Domenica
Ceci, Michelangelo
author_facet Pio, Gianvito
Malerba, Donato
D'Elia, Domenica
Ceci, Michelangelo
author_sort Pio, Gianvito
collection PubMed
description BACKGROUND: MicroRNAs (miRNAs) are small non-coding RNAs which play a key role in the post-transcriptional regulation of many genes. Elucidating miRNA-regulated gene networks is crucial for the understanding of mechanisms and functions of miRNAs in many biological processes, such as cell proliferation, development, differentiation and cell homeostasis, as well as in many types of human tumors. To this end, we recently presented the biclustering method HOCCLUS2 for the discovery of miRNA regulatory networks. Experiments on predicted interactions revealed that the statistical and biological consistency of the obtained networks is negatively affected by the poor reliability of the output of miRNA target prediction algorithms. Recently, some learning approaches have been proposed that combine the outputs of distinct prediction algorithms and improve their accuracy. However, the application of classical supervised learning algorithms presents two challenges: i) the presence of only positive examples in datasets of experimentally verified interactions and ii) the unbalanced number of labeled and unlabeled examples. RESULTS: We present a learning algorithm that learns to combine the scores returned by several prediction algorithms, by exploiting information conveyed by validated (positively labeled only) and unlabeled examples of interactions. To address these two related challenges, we resort to a semi-supervised ensemble learning setting. Results obtained using miRTarBase as the set of labeled (positive) interactions and mirDIP as the set of unlabeled interactions show a significant improvement in prediction quality over competing approaches. This solution also improves the effectiveness of HOCCLUS2 in discovering biologically realistic miRNA:mRNA regulatory networks from large-scale prediction data. Using the miR-17-92 gene cluster family as a reference system and comparing results with previous experiments, we find a large increase in the number of significantly enriched biclusters in pathways, consistent with miR-17-92 functions. CONCLUSION: The proposed approach proves to be fundamental for the computational discovery of miRNA regulatory networks from large-scale predictions. This paves the way to the systematic application of HOCCLUS2 for a comprehensive reconstruction of all the possible multiple interactions established by miRNAs in regulating the expression of gene networks, which would otherwise be impossible to reconstruct by considering only experimentally validated interactions.
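The abstract describes combining the scores of several target prediction algorithms when only positive (validated) and unlabeled interactions are available. The following is a minimal sketch of one generic way to set this up, not the authors' algorithm: each ensemble member treats a random, balanced sample of unlabeled interactions as provisional negatives, fits a small logistic regression on the predictor scores, and the members' outputs are averaged. The function names, the three-predictor feature layout, and all numbers are hypothetical and chosen only for illustration.

```python
import math
import random


def train_logistic(X, y, epochs=200, lr=0.5):
    """Fit a plain logistic regression by batch gradient descent."""
    n_features = len(X[0])
    w = [0.0] * n_features
    b = 0.0
    n = len(X)
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for xi, yi in zip(X, y):
            z = b + sum(wj * xj for wj, xj in zip(w, xi))
            p = 1.0 / (1.0 + math.exp(-z))
            err = p - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict_proba(model, x):
    """Probability that interaction x is a true target under one model."""
    w, b = model
    z = b + sum(wj * xj for wj, xj in zip(w, x))
    return 1.0 / (1.0 + math.exp(-z))


def train_pu_ensemble(positives, unlabeled, n_models=10, seed=0):
    """Train several models, each on all positives plus an equally sized
    random sample of unlabeled examples used as provisional negatives."""
    rng = random.Random(seed)
    k = min(len(positives), len(unlabeled))
    models = []
    for _ in range(n_models):
        pseudo_negatives = rng.sample(unlabeled, k)
        X = positives + pseudo_negatives
        y = [1.0] * len(positives) + [0.0] * k
        models.append(train_logistic(X, y))
    return models


def ensemble_score(models, x):
    """Average the members' probabilities into one combined score."""
    return sum(predict_proba(m, x) for m in models) / len(models)


# Hypothetical data: each row holds the scores that three prediction
# algorithms assign to one miRNA:mRNA pair (scaled to [0, 1]).
validated = [[0.9, 0.8, 0.85], [0.8, 0.9, 0.7], [0.95, 0.7, 0.9], [0.7, 0.85, 0.8]]
unlabeled = [[0.1, 0.2, 0.15], [0.2, 0.1, 0.3], [0.3, 0.2, 0.1], [0.15, 0.25, 0.2],
             [0.85, 0.9, 0.8], [0.2, 0.3, 0.25], [0.1, 0.1, 0.2], [0.25, 0.15, 0.1]]

models = train_pu_ensemble(validated, unlabeled)
for pair in unlabeled:
    print(pair, round(ensemble_score(models, pair), 3))
```

Sampling the provisional negatives in balanced batches is one simple response to the two challenges named in the abstract (no true negatives, and far more unlabeled than labeled examples); averaging over many samples reduces the effect of any unlabeled pair that is in fact a true interaction.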
format Online
Article
Text
id pubmed-4015287
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4015287 2014-05-23 Integrating microRNA target predictions for the discovery of gene regulatory networks: a semi-supervised ensemble learning approach Pio, Gianvito Malerba, Donato D'Elia, Domenica Ceci, Michelangelo BMC Bioinformatics Research BACKGROUND: MicroRNAs (miRNAs) are small non-coding RNAs which play a key role in the post-transcriptional regulation of many genes. Elucidating miRNA-regulated gene networks is crucial for the understanding of mechanisms and functions of miRNAs in many biological processes, such as cell proliferation, development, differentiation and cell homeostasis, as well as in many types of human tumors. To this end, we recently presented the biclustering method HOCCLUS2 for the discovery of miRNA regulatory networks. Experiments on predicted interactions revealed that the statistical and biological consistency of the obtained networks is negatively affected by the poor reliability of the output of miRNA target prediction algorithms. Recently, some learning approaches have been proposed that combine the outputs of distinct prediction algorithms and improve their accuracy. However, the application of classical supervised learning algorithms presents two challenges: i) the presence of only positive examples in datasets of experimentally verified interactions and ii) the unbalanced number of labeled and unlabeled examples. RESULTS: We present a learning algorithm that learns to combine the scores returned by several prediction algorithms, by exploiting information conveyed by validated (positively labeled only) and unlabeled examples of interactions. To address these two related challenges, we resort to a semi-supervised ensemble learning setting. Results obtained using miRTarBase as the set of labeled (positive) interactions and mirDIP as the set of unlabeled interactions show a significant improvement in prediction quality over competing approaches. This solution also improves the effectiveness of HOCCLUS2 in discovering biologically realistic miRNA:mRNA regulatory networks from large-scale prediction data. Using the miR-17-92 gene cluster family as a reference system and comparing results with previous experiments, we find a large increase in the number of significantly enriched biclusters in pathways, consistent with miR-17-92 functions. CONCLUSION: The proposed approach proves to be fundamental for the computational discovery of miRNA regulatory networks from large-scale predictions. This paves the way to the systematic application of HOCCLUS2 for a comprehensive reconstruction of all the possible multiple interactions established by miRNAs in regulating the expression of gene networks, which would otherwise be impossible to reconstruct by considering only experimentally validated interactions. BioMed Central 2014-01-10 /pmc/articles/PMC4015287/ /pubmed/24564296 http://dx.doi.org/10.1186/1471-2105-15-S1-S4 Text en Copyright © 2014 Pio et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle Research
Pio, Gianvito
Malerba, Donato
D'Elia, Domenica
Ceci, Michelangelo
Integrating microRNA target predictions for the discovery of gene regulatory networks: a semi-supervised ensemble learning approach
title Integrating microRNA target predictions for the discovery of gene regulatory networks: a semi-supervised ensemble learning approach
title_full Integrating microRNA target predictions for the discovery of gene regulatory networks: a semi-supervised ensemble learning approach
title_fullStr Integrating microRNA target predictions for the discovery of gene regulatory networks: a semi-supervised ensemble learning approach
title_full_unstemmed Integrating microRNA target predictions for the discovery of gene regulatory networks: a semi-supervised ensemble learning approach
title_short Integrating microRNA target predictions for the discovery of gene regulatory networks: a semi-supervised ensemble learning approach
title_sort integrating microrna target predictions for the discovery of gene regulatory networks: a semi-supervised ensemble learning approach
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4015287/
https://www.ncbi.nlm.nih.gov/pubmed/24564296
http://dx.doi.org/10.1186/1471-2105-15-S1-S4
work_keys_str_mv AT piogianvito integratingmicrornatargetpredictionsforthediscoveryofgeneregulatorynetworksasemisupervisedensemblelearningapproach
AT malerbadonato integratingmicrornatargetpredictionsforthediscoveryofgeneregulatorynetworksasemisupervisedensemblelearningapproach
AT deliadomenica integratingmicrornatargetpredictionsforthediscoveryofgeneregulatorynetworksasemisupervisedensemblelearningapproach
AT cecimichelangelo integratingmicrornatargetpredictionsforthediscoveryofgeneregulatorynetworksasemisupervisedensemblelearningapproach