
A systematic comparative evaluation of biclustering techniques

BACKGROUND: Biclustering techniques are capable of simultaneously clustering rows and columns of a data matrix. These techniques became very popular for the analysis of gene expression data, since a gene can take part in multiple biological pathways, which in turn can be active only under specific ex...


Bibliographic Details
Main authors: Padilha, Victor A., Campello, Ricardo J. G. B.
Format: Online Article Text
Language: English
Published: BioMed Central 2017
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5259837/
https://www.ncbi.nlm.nih.gov/pubmed/28114903
http://dx.doi.org/10.1186/s12859-017-1487-1
author Padilha, Victor A.
Campello, Ricardo J. G. B.
collection PubMed
description BACKGROUND: Biclustering techniques are capable of simultaneously clustering rows and columns of a data matrix. These techniques became very popular for the analysis of gene expression data, since a gene can take part in multiple biological pathways, which in turn can be active only under specific experimental conditions. Several biclustering algorithms have been developed in recent years. In order to provide guidance regarding their choice, a few comparative studies were conducted and reported in the literature. In these studies, however, the performance of the methods was evaluated through external measures that have more recently been shown to have undesirable properties. Furthermore, they considered a limited number of algorithms and datasets. RESULTS: We conducted a broader comparative study involving seventeen algorithms, which were run on three synthetic data collections and two real data collections with a more representative number of datasets. For the experiments with synthetic data, five different experimental scenarios were studied: different levels of noise, different numbers of implanted biclusters, different levels of symmetric bicluster overlap, different levels of asymmetric bicluster overlap and different bicluster sizes, for which the results were assessed with more suitable external measures. For the experiments with real datasets, the results were assessed by gene set enrichment and clustering accuracy. CONCLUSIONS: We observed that each algorithm achieved satisfactory results in some of the biclustering tasks in which it was investigated. The choice of the best algorithm for a given application thus depends on the task at hand and the types of patterns that one wants to detect.
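The synthetic-data setup mentioned in the description (a bicluster implanted in a noisy matrix, then scored against the known answer) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the article: the function names `implant_bicluster` and `jaccard_cells`, the constant-shift bicluster model, the thresholding step that stands in for a real biclustering algorithm, and the cell-level Jaccard index that stands in for the external measures used in the study.

```python
import numpy as np


def implant_bicluster(n_rows, n_cols, rows, cols, shift=5.0, noise_sd=0.5, seed=0):
    """Return a Gaussian-noise matrix with one constant-shift bicluster implanted."""
    rng = np.random.default_rng(seed)
    data = rng.normal(0.0, noise_sd, size=(n_rows, n_cols))
    # np.ix_ addresses the submatrix formed by the chosen rows and columns.
    data[np.ix_(rows, cols)] += shift
    return data


def jaccard_cells(rows_a, cols_a, rows_b, cols_b):
    """Jaccard index between two biclusters, each viewed as a set of matrix cells."""
    cells_a = {(r, c) for r in rows_a for c in cols_a}
    cells_b = {(r, c) for r in rows_b for c in cols_b}
    union = cells_a | cells_b
    return len(cells_a & cells_b) / len(union) if union else 1.0


# Hypothetical ground truth: rows 0-3 and columns 0-2 form the implanted bicluster.
true_rows, true_cols = [0, 1, 2, 3], [0, 1, 2]
data = implant_bicluster(20, 10, true_rows, true_cols)

# Naive stand-in for a biclustering algorithm: mark cells above half the shift,
# then keep every row and column that contains at least one marked cell.
mask = data > 2.5
found_rows = [int(r) for r in np.flatnonzero(mask.any(axis=1))]
found_cols = [int(c) for c in np.flatnonzero(mask.any(axis=0))]

score = jaccard_cells(true_rows, true_cols, found_rows, found_cols)
print(f"recovered rows={found_rows}, cols={found_cols}, Jaccard={score:.2f}")
```

Raising `noise_sd` or implanting several overlapping biclusters gives a rough feel for the experimental scenarios listed in the description, although a real comparison would use actual biclustering algorithms and the measures chosen by the authors.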
format Online
Article
Text
id pubmed-5259837
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-5259837 2017-01-26 A systematic comparative evaluation of biclustering techniques Padilha, Victor A. Campello, Ricardo J. G. B. BMC Bioinformatics Research Article
BioMed Central 2017-01-23 /pmc/articles/PMC5259837/ /pubmed/28114903 http://dx.doi.org/10.1186/s12859-017-1487-1 Text en © The Author(s) 2017 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
topic Research Article