BicSPAM: flexible biclustering using sequential patterns
BACKGROUND: Biclustering is a critical task for biomedical applications. Order-preserving biclusters, submatrices where the values of rows induce the same linear ordering across columns, capture local regularities with constant, shifting, scaling and sequential assumptions. Additionally, biclusterin...
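The order-preserving property named in the abstract (rows of a submatrix inducing the same linear ordering across its columns) can be illustrated with a small check. This is only a sketch of the definition, not the BicSPAM algorithm: the function names, the toy matrix, and the tie-breaking rule (ties resolved by column index) are assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def column_ordering(values: Sequence[float]) -> Tuple[int, ...]:
    """Return the positions of `values` sorted by ascending value.

    Assumption: ties are broken by position, because sorted() is stable.
    """
    return tuple(sorted(range(len(values)), key=lambda j: values[j]))


def is_order_preserving(matrix: List[List[float]],
                        rows: Sequence[int],
                        cols: Sequence[int]) -> bool:
    """True if every selected row ranks the selected columns identically."""
    orderings = {
        column_ordering([matrix[i][j] for j in cols])
        for i in rows
    }
    return len(orderings) <= 1


# Hypothetical toy matrix: on columns 0..2, row 1 is row 0 scaled by 2
# and row 2 is row 0 shifted by 3, so all three share one ordering.
matrix = [
    [1.0, 5.0, 3.0, 9.0],
    [2.0, 10.0, 6.0, 0.5],
    [4.0, 8.0, 6.0, 7.0],
    [9.0, 1.0, 5.0, 2.0],
]

print(is_order_preserving(matrix, rows=[0, 1, 2], cols=[0, 1, 2]))
print(is_order_preserving(matrix, rows=[0, 1, 2, 3], cols=[0, 1, 2]))
```

The first call covers the shifted and scaled rows, which agree on the ordering of columns 0 to 2; the second adds row 3, whose values rank those columns differently. A check like this only verifies a candidate submatrix and does not search for biclusters, which is the part the paper addresses with sequential pattern mining.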
Main authors: | Henriques, Rui; Madeira, Sara C |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2014 |
Subjects: | Research Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4071222/ https://www.ncbi.nlm.nih.gov/pubmed/24885271 http://dx.doi.org/10.1186/1471-2105-15-130 |
_version_ | 1782322790148341760 |
---|---|
author | Henriques, Rui Madeira, Sara C |
author_facet | Henriques, Rui Madeira, Sara C |
author_sort | Henriques, Rui |
collection | PubMed |
description | BACKGROUND: Biclustering is a critical task for biomedical applications. Order-preserving biclusters, submatrices where the values of rows induce the same linear ordering across columns, capture local regularities with constant, shifting, scaling and sequential assumptions. Additionally, biclustering approaches relying on pattern mining output deliver exhaustive solutions with an arbitrary number and positioning of biclusters. However, existing order-preserving approaches suffer from robustness, scalability and/or flexibility issues. Additionally, they are not able to discover biclusters with symmetries and parameterizable levels of noise. RESULTS: We propose new biclustering algorithms to perform flexible, exhaustive and noise-tolerant biclustering based on sequential patterns (BicSPAM). Strategies are proposed to allow for symmetries and to seize efficiency gains from item-indexable properties and/or from partitioning methods with conservative distance guarantees. Results show BicSPAM ability to capture symmetries, handle planted noise, and scale in terms of memory and time. BicSPAM also achieves the best match-scores for the recovery of hidden biclusters in synthetic datasets with varying noise distributions and levels of missing values. Finally, results on gene expression data lead to complete solutions, delivering new biclusters corresponding to putative modules with heightened biological relevance. CONCLUSIONS: BicSPAM provides an exhaustive way to discover flexible structures of order-preserving biclusters. To the best of our knowledge, BicSPAM is the first attempt to deal with order-preserving biclusters that allow for symmetries and that are robust to varying levels of noise. |
format | Online Article Text |
id | pubmed-4071222 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2014 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-4071222 2014-06-27 BicSPAM: flexible biclustering using sequential patterns Henriques, Rui Madeira, Sara C BMC Bioinformatics Research Article BACKGROUND: Biclustering is a critical task for biomedical applications. Order-preserving biclusters, submatrices where the values of rows induce the same linear ordering across columns, capture local regularities with constant, shifting, scaling and sequential assumptions. Additionally, biclustering approaches relying on pattern mining output deliver exhaustive solutions with an arbitrary number and positioning of biclusters. However, existing order-preserving approaches suffer from robustness, scalability and/or flexibility issues. Additionally, they are not able to discover biclusters with symmetries and parameterizable levels of noise. RESULTS: We propose new biclustering algorithms to perform flexible, exhaustive and noise-tolerant biclustering based on sequential patterns (BicSPAM). Strategies are proposed to allow for symmetries and to seize efficiency gains from item-indexable properties and/or from partitioning methods with conservative distance guarantees. Results show BicSPAM ability to capture symmetries, handle planted noise, and scale in terms of memory and time. BicSPAM also achieves the best match-scores for the recovery of hidden biclusters in synthetic datasets with varying noise distributions and levels of missing values. Finally, results on gene expression data lead to complete solutions, delivering new biclusters corresponding to putative modules with heightened biological relevance. CONCLUSIONS: BicSPAM provides an exhaustive way to discover flexible structures of order-preserving biclusters. To the best of our knowledge, BicSPAM is the first attempt to deal with order-preserving biclusters that allow for symmetries and that are robust to varying levels of noise.
BioMed Central 2014-05-06 /pmc/articles/PMC4071222/ /pubmed/24885271 http://dx.doi.org/10.1186/1471-2105-15-130 Text en Copyright © 2014 Henriques and Madeira; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. |
spellingShingle | Research Article Henriques, Rui Madeira, Sara C BicSPAM: flexible biclustering using sequential patterns |
title | BicSPAM: flexible biclustering using sequential patterns |
title_full | BicSPAM: flexible biclustering using sequential patterns |
title_fullStr | BicSPAM: flexible biclustering using sequential patterns |
title_full_unstemmed | BicSPAM: flexible biclustering using sequential patterns |
title_short | BicSPAM: flexible biclustering using sequential patterns |
title_sort | bicspam: flexible biclustering using sequential patterns |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4071222/ https://www.ncbi.nlm.nih.gov/pubmed/24885271 http://dx.doi.org/10.1186/1471-2105-15-130 |
work_keys_str_mv | AT henriquesrui bicspamflexiblebiclusteringusingsequentialpatterns AT madeirasarac bicspamflexiblebiclusteringusingsequentialpatterns |