
BicSPAM: flexible biclustering using sequential patterns

BACKGROUND: Biclustering is a critical task for biomedical applications. Order-preserving biclusters, submatrices where the values of rows induce the same linear ordering across columns, capture local regularities with constant, shifting, scaling and sequential assumptions. Additionally, biclusterin...


Bibliographic Details
Main authors: Henriques, Rui; Madeira, Sara C
Format: Online Article Text
Language: English
Published: BioMed Central 2014
Subjects: Research Article
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4071222/
https://www.ncbi.nlm.nih.gov/pubmed/24885271
http://dx.doi.org/10.1186/1471-2105-15-130
author Henriques, Rui
Madeira, Sara C
collection PubMed
description BACKGROUND: Biclustering is a critical task for biomedical applications. Order-preserving biclusters, submatrices where the values of rows induce the same linear ordering across columns, capture local regularities with constant, shifting, scaling and sequential assumptions. Additionally, biclustering approaches relying on pattern mining output deliver exhaustive solutions with an arbitrary number and positioning of biclusters. However, existing order-preserving approaches suffer from robustness, scalability and/or flexibility issues. Additionally, they are not able to discover biclusters with symmetries and parameterizable levels of noise. RESULTS: We propose new biclustering algorithms to perform flexible, exhaustive and noise-tolerant biclustering based on sequential patterns (BicSPAM). Strategies are proposed to allow for symmetries and to seize efficiency gains from item-indexable properties and/or from partitioning methods with conservative distance guarantees. Results show BicSPAM ability to capture symmetries, handle planted noise, and scale in terms of memory and time. BicSPAM also achieves the best match-scores for the recovery of hidden biclusters in synthetic datasets with varying noise distributions and levels of missing values. Finally, results on gene expression data lead to complete solutions, delivering new biclusters corresponding to putative modules with heightened biological relevance. CONCLUSIONS: BicSPAM provides an exhaustive way to discover flexible structures of order-preserving biclusters. To the best of our knowledge, BicSPAM is the first attempt to deal with order-preserving biclusters that allow for symmetries and that are robust to varying levels of noise.
format Online
Article
Text
id pubmed-4071222
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4071222 2014-06-27 BicSPAM: flexible biclustering using sequential patterns Henriques, Rui Madeira, Sara C BMC Bioinformatics Research Article
BioMed Central 2014-05-06 /pmc/articles/PMC4071222/ /pubmed/24885271 http://dx.doi.org/10.1186/1471-2105-15-130 Text en Copyright © 2014 Henriques and Madeira; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited.
title BicSPAM: flexible biclustering using sequential patterns
topic Research Article
work_keys_str_mv AT henriquesrui bicspamflexiblebiclusteringusingsequentialpatterns
AT madeirasarac bicspamflexiblebiclusteringusingsequentialpatterns
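
The abstract defines an order-preserving bicluster as a submatrix in which the values of every row induce the same linear ordering across the columns. The sketch below is an illustration of that definition only, not the BicSPAM algorithm: it checks whether a chosen set of rows and columns satisfies the property. The function names, the toy matrix, and the tie-breaking rule (equal values ordered by column index) are assumptions introduced here and do not come from the article.

```python
def column_order(values):
    """Return the positions of `values` sorted by ascending value.

    Assumption for this sketch: ties are broken by position.
    """
    return tuple(sorted(range(len(values)), key=lambda j: values[j]))


def is_order_preserving(matrix, rows, cols):
    """True if every selected row orders the selected columns identically."""
    orders = {column_order([matrix[i][j] for j in cols]) for i in rows}
    return len(orders) <= 1


# Hypothetical toy matrix (rows = genes, columns = conditions).
matrix = [
    [1.0, 5.0, 3.0, 9.0],
    [2.0, 8.0, 4.0, 0.5],     # shifted values, same ordering on columns 0-2
    [10.0, 30.0, 20.0, 7.0],  # scaled values, same ordering on columns 0-2
    [6.0, 1.0, 3.0, 2.0],     # different ordering
]

print(is_order_preserving(matrix, rows=[0, 1, 2], cols=[0, 1, 2]))
print(is_order_preserving(matrix, rows=[0, 1, 2, 3], cols=[0, 1, 2]))
```

Rows 0 to 2 agree on the ordering of columns 0, 2, 1 even though their values are shifted or scaled, which is why a single ordering test can cover the constant, shifting and scaling cases the abstract mentions. Adding row 3 breaks the agreement.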