BiC2PAM: constraint-guided biclustering for biological data analysis with domain knowledge
BACKGROUND: Biclustering has been largely used in biological data analysis, enabling the discovery of putative functional modules from omic and network data. Despite the recognized importance of incorporating domain knowledge to guide biclustering and guarantee a focus on relevant and non-trivial bi...
Main authors: | Henriques, Rui; Madeira, Sara C. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2016 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5024481/ https://www.ncbi.nlm.nih.gov/pubmed/27651825 http://dx.doi.org/10.1186/s13015-016-0085-5 |
_version_ | 1782453809267605504 |
---|---|
author | Henriques, Rui Madeira, Sara C. |
author_facet | Henriques, Rui Madeira, Sara C. |
author_sort | Henriques, Rui |
collection | PubMed |
description | BACKGROUND: Biclustering has been largely used in biological data analysis, enabling the discovery of putative functional modules from omic and network data. Despite the recognized importance of incorporating domain knowledge to guide biclustering and guarantee a focus on relevant and non-trivial biclusters, this possibility has not yet been comprehensively addressed. This results from the fact that the majority of existing algorithms are only able to deliver sub-optimal solutions with restrictive assumptions on the structure, coherency and quality of biclustering solutions, thus preventing the up-front satisfaction of knowledge-driven constraints. Interestingly, in recent years, a clearer understanding of the synergies between pattern mining and biclustering gave rise to a new class of algorithms, termed pattern-based biclustering algorithms. These algorithms, able to efficiently discover flexible biclustering solutions with optimality guarantees, are thus positioned as good candidates for knowledge incorporation. In this context, this work aims to bridge the current lack of solid views on the use of background knowledge to guide (pattern-based) biclustering tasks. METHODS: This work extends (pattern-based) biclustering algorithms to guarantee the satisfiability of constraints derived from background knowledge and to effectively explore efficiency gains from their incorporation. In this context, we first show the relevance of constraints with succinct, (anti-)monotone and convertible properties for the analysis of expression data and biological networks. We further show how pattern-based biclustering algorithms can be adapted to effectively prune the search space in the presence of such constraints, as well as be guided in the presence of biological annotations. Relying on these contributions, we propose BiClustering with Constraints using PAttern Mining (BiC2PAM), an extension of BicPAM and BicNET biclustering algorithms. RESULTS: Experimental results on biological data demonstrate the importance of incorporating knowledge within biclustering to foster efficiency and enable the discovery of non-trivial biclusters with heightened biological relevance. CONCLUSIONS: This work provides the first comprehensive view and sound algorithm for biclustering biological data with constraints derived from user expectations, knowledge repositories and/or literature. |
format | Online Article Text |
id | pubmed-5024481 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2016 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-5024481 2016-09-20 BiC2PAM: constraint-guided biclustering for biological data analysis with domain knowledge Henriques, Rui Madeira, Sara C. Algorithms Mol Biol Research BACKGROUND: Biclustering has been largely used in biological data analysis, enabling the discovery of putative functional modules from omic and network data. Despite the recognized importance of incorporating domain knowledge to guide biclustering and guarantee a focus on relevant and non-trivial biclusters, this possibility has not yet been comprehensively addressed. This results from the fact that the majority of existing algorithms are only able to deliver sub-optimal solutions with restrictive assumptions on the structure, coherency and quality of biclustering solutions, thus preventing the up-front satisfaction of knowledge-driven constraints. Interestingly, in recent years, a clearer understanding of the synergies between pattern mining and biclustering gave rise to a new class of algorithms, termed pattern-based biclustering algorithms. These algorithms, able to efficiently discover flexible biclustering solutions with optimality guarantees, are thus positioned as good candidates for knowledge incorporation. In this context, this work aims to bridge the current lack of solid views on the use of background knowledge to guide (pattern-based) biclustering tasks. METHODS: This work extends (pattern-based) biclustering algorithms to guarantee the satisfiability of constraints derived from background knowledge and to effectively explore efficiency gains from their incorporation. In this context, we first show the relevance of constraints with succinct, (anti-)monotone and convertible properties for the analysis of expression data and biological networks. We further show how pattern-based biclustering algorithms can be adapted to effectively prune the search space in the presence of such constraints, as well as be guided in the presence of biological annotations. Relying on these contributions, we propose BiClustering with Constraints using PAttern Mining (BiC2PAM), an extension of BicPAM and BicNET biclustering algorithms. RESULTS: Experimental results on biological data demonstrate the importance of incorporating knowledge within biclustering to foster efficiency and enable the discovery of non-trivial biclusters with heightened biological relevance. CONCLUSIONS: This work provides the first comprehensive view and sound algorithm for biclustering biological data with constraints derived from user expectations, knowledge repositories and/or literature. BioMed Central 2016-09-14 /pmc/articles/PMC5024481/ /pubmed/27651825 http://dx.doi.org/10.1186/s13015-016-0085-5 Text en © The Author(s) 2016 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Research Henriques, Rui Madeira, Sara C. BiC2PAM: constraint-guided biclustering for biological data analysis with domain knowledge |
title | BiC2PAM: constraint-guided biclustering for biological data analysis with domain knowledge |
title_full | BiC2PAM: constraint-guided biclustering for biological data analysis with domain knowledge |
title_fullStr | BiC2PAM: constraint-guided biclustering for biological data analysis with domain knowledge |
title_full_unstemmed | BiC2PAM: constraint-guided biclustering for biological data analysis with domain knowledge |
title_short | BiC2PAM: constraint-guided biclustering for biological data analysis with domain knowledge |
title_sort | bic2pam: constraint-guided biclustering for biological data analysis with domain knowledge |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5024481/ https://www.ncbi.nlm.nih.gov/pubmed/27651825 http://dx.doi.org/10.1186/s13015-016-0085-5 |
work_keys_str_mv | AT henriquesrui bic2pamconstraintguidedbiclusteringforbiologicaldataanalysiswithdomainknowledge AT madeirasarac bic2pamconstraintguidedbiclusteringforbiologicaldataanalysiswithdomainknowledge |