Identification of Bicluster Regions in a Binary Matrix and Its Applications

Biclustering has emerged as an important approach to the analysis of large-scale datasets. A biclustering technique identifies a subset of rows that exhibit similar patterns on a subset of columns in a data matrix. Many biclustering methods have been proposed, and most, if not all, algorithms are de...

Bibliographic Details
Main Authors: Chen, Hung-Chia; Zou, Wen; Tien, Yin-Jing; Chen, James J.
Format: Online Article Text
Language: English
Published: Public Library of Science 2013
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3733970/
https://www.ncbi.nlm.nih.gov/pubmed/23940779
http://dx.doi.org/10.1371/journal.pone.0071680
author Chen, Hung-Chia
Zou, Wen
Tien, Yin-Jing
Chen, James J.
collection PubMed
description Biclustering has emerged as an important approach to the analysis of large-scale datasets. A biclustering technique identifies a subset of rows that exhibit similar patterns on a subset of columns in a data matrix. Many biclustering methods have been proposed, and most, if not all, of these algorithms were developed to detect regions of “coherence” patterns. Such methods perform unsatisfactorily when the purpose is to identify biclusters of a constant level. This paper presents a two-step biclustering method to identify constant-level biclusters for binary or quantitative data. The algorithm identifies the submatrix of maximal dimensions such that the proportion of non-signals is less than a pre-specified tolerance δ. In the analysis of two synthetic datasets, the proposed method had much higher sensitivity and slightly lower specificity than several prominent biclustering methods. It was further compared with the Bimax method on two real datasets. The proposed method was shown to be the most robust in terms of sensitivity, number of biclusters, and number of serotype-specific biclusters identified. However, dichotomization using different signal-level thresholds usually leads to different sets of biclusters; this also occurs in the present analysis.
format Online
Article
Text
id pubmed-3733970
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-3733970 2013-08-12 Identification of Bicluster Regions in a Binary Matrix and Its Applications Chen, Hung-Chia Zou, Wen Tien, Yin-Jing Chen, James J. PLoS One Research Article Public Library of Science 2013-08-05 /pmc/articles/PMC3733970/ /pubmed/23940779 http://dx.doi.org/10.1371/journal.pone.0071680 Text en https://creativecommons.org/publicdomain/zero/1.0/ This is an open-access article distributed under the terms of the Creative Commons Public Domain declaration, which stipulates that, once placed in the public domain, this work may be freely reproduced, distributed, transmitted, modified, built upon, or otherwise used by anyone for any lawful purpose.
title Identification of Bicluster Regions in a Binary Matrix and Its Applications
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3733970/
https://www.ncbi.nlm.nih.gov/pubmed/23940779
http://dx.doi.org/10.1371/journal.pone.0071680
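
The description states that a bicluster is a submatrix in which the proportion of non-signals is less than a tolerance δ, and that dichotomizing at different thresholds usually yields different biclusters. The Python sketch below illustrates only that acceptance criterion on a toy matrix; it is not the authors' two-step search algorithm. The function names, the toy data, and the convention that a value at or above the threshold counts as a signal are all assumptions made for illustration.

```python
def dichotomize(matrix, threshold):
    """Turn quantitative values into 0/1 signals.

    Assumption: a value >= threshold counts as a signal (1).
    """
    return [[1 if value >= threshold else 0 for value in row] for row in matrix]


def non_signal_proportion(binary, rows, cols):
    """Fraction of zero cells in the submatrix picked out by rows x cols."""
    cells = [binary[r][c] for r in rows for c in cols]
    if not cells:
        raise ValueError("submatrix must contain at least one cell")
    return cells.count(0) / len(cells)


def within_tolerance(binary, rows, cols, delta):
    """True if the non-signal proportion is strictly less than delta."""
    return non_signal_proportion(binary, rows, cols) < delta


# Toy quantitative data (hypothetical values, not from the paper).
data = [
    [0.9, 0.8, 0.1],
    [0.7, 0.2, 0.9],
    [0.1, 0.1, 0.2],
]
rows, cols = [0, 1], [0, 1]

low = dichotomize(data, 0.5)    # [[1, 1, 0], [1, 0, 1], [0, 0, 0]]
high = dichotomize(data, 0.75)  # [[1, 1, 0], [0, 0, 1], [0, 0, 0]]

print(within_tolerance(low, rows, cols, delta=0.3))   # True: 1 of 4 cells is a non-signal
print(within_tolerance(high, rows, cols, delta=0.3))  # False: 2 of 4 cells are non-signals
```

The same candidate region passes the δ check at one threshold and fails at the other, which is the threshold sensitivity the description mentions.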