ARBic: an all-round biclustering algorithm for analyzing gene expression data
Identifying significant biclusters of genes with specific expression patterns is an effective approach to reveal functionally correlated genes in gene expression data. However, none of the existing algorithms can simultaneously identify both broader and narrower biclusters due to their failure of balanc...
Main authors: | Liu, Xiangyu; Yu, Ting; Zhao, Xiaoyu; Long, Chaoyi; Han, Renmin; Su, Zhengchang; Li, Guojun |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press, 2023 |
Subjects: | Methods Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9887595/ https://www.ncbi.nlm.nih.gov/pubmed/36733402 http://dx.doi.org/10.1093/nargab/lqad009 |
_version_ | 1784880373779398656 |
---|---|
author | Liu, Xiangyu Yu, Ting Zhao, Xiaoyu Long, Chaoyi Han, Renmin Su, Zhengchang Li, Guojun |
author_sort | Liu, Xiangyu |
collection | PubMed |
description | Identifying significant biclusters of genes with specific expression patterns is an effective approach to reveal functionally correlated genes in gene expression data. However, none of the existing algorithms can simultaneously identify both broader and narrower biclusters because they fail to balance effectiveness and efficiency. We introduced ARBic, an algorithm capable of accurately identifying significant biclusters of any shape, including broader, narrower and square ones, in any large-scale gene expression dataset. ARBic was designed by integrating column-based and row-based strategies into a single biclustering procedure. The column-based strategy, borrowed from RecBic, a recently published biclustering tool, extracts narrower biclusters, while the row-based strategy, which iteratively finds the longest path in a specific directed graph, extracts broader ones. When tested against seven other salient biclustering algorithms on simulated datasets, ARBic achieves recovery, relevance and [Formula: see text] scores that are on average at least 29% higher than those of the best existing tool. In addition, ARBic substantially outperforms all tools on real datasets and is more robust to noise, bicluster shapes and dataset types. |
format | Online Article Text |
id | pubmed-9887595 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-9887595 2023-02-01 ARBic: an all-round biclustering algorithm for analyzing gene expression data. Liu, Xiangyu; Yu, Ting; Zhao, Xiaoyu; Long, Chaoyi; Han, Renmin; Su, Zhengchang; Li, Guojun. NAR Genom Bioinform. Methods Article. Identifying significant biclusters of genes with specific expression patterns is an effective approach to reveal functionally correlated genes in gene expression data. However, none of the existing algorithms can simultaneously identify both broader and narrower biclusters because they fail to balance effectiveness and efficiency. We introduced ARBic, an algorithm capable of accurately identifying significant biclusters of any shape, including broader, narrower and square ones, in any large-scale gene expression dataset. ARBic was designed by integrating column-based and row-based strategies into a single biclustering procedure. The column-based strategy, borrowed from RecBic, a recently published biclustering tool, extracts narrower biclusters, while the row-based strategy, which iteratively finds the longest path in a specific directed graph, extracts broader ones. When tested against seven other salient biclustering algorithms on simulated datasets, ARBic achieves recovery, relevance and [Formula: see text] scores that are on average at least 29% higher than those of the best existing tool. In addition, ARBic substantially outperforms all tools on real datasets and is more robust to noise, bicluster shapes and dataset types. Oxford University Press 2023-01-31 /pmc/articles/PMC9887595/ /pubmed/36733402 http://dx.doi.org/10.1093/nargab/lqad009 Text en © The Author(s) 2023. Published by Oxford University Press on behalf of NAR Genomics and Bioinformatics. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
title | ARBic: an all-round biclustering algorithm for analyzing gene expression data |
topic | Methods Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9887595/ https://www.ncbi.nlm.nih.gov/pubmed/36733402 http://dx.doi.org/10.1093/nargab/lqad009 |
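
The description above says that the row-based strategy iteratively finds the longest path in a specific directed graph. The record does not define that graph, so the following is only a minimal, generic sketch of the underlying primitive: a longest path in a directed acyclic graph, computed by dynamic programming over a topological order. The function name `longest_path` and the edge-list input are assumptions made for illustration; this is not ARBic's actual implementation.

```python
from collections import defaultdict, deque


def longest_path(edges):
    """Return one longest path (list of nodes) in a DAG given as (u, v) pairs.

    Hypothetical helper for illustration; the graph ARBic builds is not
    specified in this record.
    """
    succ = defaultdict(list)   # adjacency list: node -> successors
    indeg = defaultdict(int)   # number of incoming edges per node
    nodes = set()
    for u, v in edges:
        succ[u].append(v)
        indeg[v] += 1
        nodes.update((u, v))
    if not nodes:
        return []

    # Kahn's algorithm: repeatedly emit nodes with no remaining predecessors.
    queue = deque(n for n in nodes if indeg[n] == 0)
    order = []
    while queue:
        u = queue.popleft()
        order.append(u)
        for v in succ[u]:
            indeg[v] -= 1
            if indeg[v] == 0:
                queue.append(v)
    if len(order) != len(nodes):
        raise ValueError("graph contains a cycle; longest path is undefined here")

    # best_len[n] = number of nodes on the longest path ending at n.
    best_len = {n: 1 for n in nodes}
    prev = {}
    for u in order:
        for v in succ[u]:
            if best_len[u] + 1 > best_len[v]:
                best_len[v] = best_len[u] + 1
                prev[v] = u

    # Walk back from the node where the longest path ends.
    end = max(order, key=lambda n: best_len[n])
    path = [end]
    while path[-1] in prev:
        path.append(prev[path[-1]])
    return path[::-1]
```

For example, `longest_path([("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")])` follows the chain through `b` instead of the shortcut from `a` to `c`. The topological order makes the computation linear in the number of nodes and edges, which matters when the search is repeated many times.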