BiC2PAM: constraint-guided biclustering for biological data analysis with domain knowledge

Bibliographic Details
Main authors: Henriques, Rui; Madeira, Sara C.
Format: Online Article Text
Language: English
Published: BioMed Central 2016
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5024481/
https://www.ncbi.nlm.nih.gov/pubmed/27651825
http://dx.doi.org/10.1186/s13015-016-0085-5
_version_ 1782453809267605504
author Henriques, Rui
Madeira, Sara C.
collection PubMed
description BACKGROUND: Biclustering has been widely used in biological data analysis, enabling the discovery of putative functional modules from omic and network data. Despite the recognized importance of incorporating domain knowledge to guide biclustering and guarantee a focus on relevant and non-trivial biclusters, this possibility has not yet been comprehensively addressed. This is because most existing algorithms can only deliver sub-optimal solutions under restrictive assumptions on the structure, coherency and quality of biclustering solutions, which prevents the up-front satisfaction of knowledge-driven constraints. In recent years, a clearer understanding of the synergies between pattern mining and biclustering has given rise to a new class of algorithms, termed pattern-based biclustering algorithms. These algorithms, able to efficiently discover flexible biclustering solutions with optimality guarantees, are thus good candidates for knowledge incorporation. In this context, this work aims to address the current lack of solid views on the use of background knowledge to guide (pattern-based) biclustering tasks.
METHODS: This work extends (pattern-based) biclustering algorithms to guarantee the satisfiability of constraints derived from background knowledge and to effectively explore efficiency gains from their incorporation. In this context, we first show the relevance of constraints with succinct, (anti-)monotone and convertible properties for the analysis of expression data and biological networks. We further show how pattern-based biclustering algorithms can be adapted to effectively prune the search space in the presence of such constraints, as well as be guided in the presence of biological annotations. Relying on these contributions, we propose BiClustering with Constraints using PAttern Mining (BiC2PAM), an extension of the BicPAM and BicNET biclustering algorithms.
RESULTS: Experimental results on biological data demonstrate the importance of incorporating knowledge within biclustering to foster efficiency and enable the discovery of non-trivial biclusters with heightened biological relevance.
CONCLUSIONS: This work provides the first comprehensive view and sound algorithm for biclustering biological data with constraints derived from user expectations, knowledge repositories and/or literature.
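The METHODS summary mentions pruning the search space with anti-monotone constraints. As a rough illustration of that general idea only, the sketch below runs a level-wise (Apriori-style) frequent itemset search in which a candidate is dropped as soon as it violates the constraint, since no superset of it can satisfy it either. This is not the BiC2PAM implementation: the function name `mine_itemsets`, the toy transactions, the item weights and the budget are all hypothetical and chosen for illustration.

```python
from itertools import combinations


def mine_itemsets(transactions, min_support, constraint):
    """Level-wise search for frequent itemsets that satisfy `constraint`.

    `constraint` is assumed to be anti-monotone: if an itemset violates it,
    every superset violates it too, so the itemset is never extended.
    """
    def support(itemset):
        # Number of transactions that contain every item of the itemset.
        return sum(1 for t in transactions if itemset <= t)

    items = sorted({i for t in transactions for i in t})
    level = [frozenset([i]) for i in items]
    level = [s for s in level if constraint(s) and support(s) >= min_support]

    result = []
    while level:
        result.extend(level)
        kept = set(level)
        size = len(level[0])
        candidates = set()
        for a in level:
            for b in level:
                c = a | b
                if len(c) != size + 1:
                    continue
                # Keep a candidate only if all of its size-k subsets survived;
                # this is where frequency and the constraint prune together.
                if all(frozenset(sub) in kept for sub in combinations(c, size)):
                    candidates.add(c)
        level = [c for c in candidates
                 if constraint(c) and support(c) >= min_support]
    return result


# Hypothetical toy data: each transaction is a set of items.
transactions = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}]
weights = {"a": 1, "b": 2, "c": 3}  # assumed non-negative item weights


def within_budget(itemset, budget=4):
    # A sum of non-negative weights bounded above is anti-monotone.
    return sum(weights[i] for i in itemset) <= budget


found = mine_itemsets(transactions, min_support=2, constraint=within_budget)
print(sorted(sorted(s) for s in found))
```

In this toy run the pair {b, c} is frequent but exceeds the budget, so it is discarded and the triple {a, b, c} is never counted against the data. That is the kind of efficiency gain constraint-based pruning is meant to provide.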
format Online
Article
Text
id pubmed-5024481
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-5024481 2016-09-20
BiC2PAM: constraint-guided biclustering for biological data analysis with domain knowledge
Henriques, Rui
Madeira, Sara C.
Algorithms Mol Biol
Research
BioMed Central 2016-09-14
/pmc/articles/PMC5024481/ /pubmed/27651825 http://dx.doi.org/10.1186/s13015-016-0085-5
Text en
© The Author(s) 2016 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title BiC2PAM: constraint-guided biclustering for biological data analysis with domain knowledge
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5024481/
https://www.ncbi.nlm.nih.gov/pubmed/27651825
http://dx.doi.org/10.1186/s13015-016-0085-5