A formal concept analysis approach to consensus clustering of multi-experiment expression data

Bibliographic Details
Main Authors: Hristoskova, Anna, Boeva, Veselka, Tsiporkova, Elena
Format: Online Article Text
Language: English
Published: BioMed Central 2014
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4033618/
https://www.ncbi.nlm.nih.gov/pubmed/24885407
http://dx.doi.org/10.1186/1471-2105-15-151
_version_ 1782317850214531072
author Hristoskova, Anna
Boeva, Veselka
Tsiporkova, Elena
author_facet Hristoskova, Anna
Boeva, Veselka
Tsiporkova, Elena
author_sort Hristoskova, Anna
collection PubMed
description BACKGROUND: Presently, with the increasing number and complexity of available gene expression datasets, the combination of data from multiple microarray studies addressing a similar biological question is gaining importance. The analysis and integration of multiple datasets are expected to yield more reliable and robust results since they are based on a larger number of samples and the effects of the individual study-specific biases are diminished. This is supported by recent studies suggesting that important biological signals are often preserved or enhanced by multiple experiments. An approach to combining data from different experiments is the aggregation of their clusterings into a consensus or representative clustering solution, which increases the confidence in the common features of all the datasets and reveals the important differences among them. RESULTS: We propose a novel generic consensus clustering technique that applies a Formal Concept Analysis (FCA) approach for the consolidation and analysis of clustering solutions derived from several microarray datasets. These datasets are initially divided into groups of related experiments with respect to a predefined criterion. Subsequently, a consensus clustering algorithm is applied to each group, resulting in a clustering solution per group. These solutions are pooled together and further analysed by employing FCA, which allows extracting valuable insights from the data and generating a gene partition over all the experiments. To validate the FCA-enhanced approach, two consensus clustering algorithms are adapted to incorporate FCA. Their performance is evaluated on gene expression data from a multi-experiment study examining the global cell-cycle control of fission yeast. The FCA results derived from both methods demonstrate that, although the two algorithms optimize different clustering characteristics, FCA is able to overcome and diminish these differences and preserve some relevant biological signals. CONCLUSIONS: The proposed FCA-enhanced consensus clustering technique is a general approach to the combination of clustering algorithms with FCA for deriving clustering solutions from multiple gene expression matrices. The experimental results presented herein demonstrate that it is a robust data integration technique able to produce a good-quality clustering solution that is representative of the whole set of expression matrices.
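The pooling step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: this record gives no algorithmic detail, so the sketch only assumes the standard FCA notion of a formal context (a binary relation between objects and attributes), with genes as objects and (group, cluster label) pairs as attributes. The gene names, group names, and function names below are hypothetical. Genes that carry exactly the same attribute set fall into the same block, which gives one simple gene partition over all groups.

```python
from collections import defaultdict

def build_context(clusterings):
    """Build a formal context from per-group clustering solutions.

    clusterings: dict mapping a group name to a dict {gene: cluster label}.
    Returns a dict mapping each gene to the frozenset of (group, label)
    attributes it has. All names here are illustrative assumptions.
    """
    context = defaultdict(set)
    for group, assignment in clusterings.items():
        for gene, label in assignment.items():
            context[gene].add((group, label))
    return {gene: frozenset(attrs) for gene, attrs in context.items()}

def partition_by_intent(context):
    """Group genes whose attribute sets are identical."""
    blocks = defaultdict(set)
    for gene, attrs in context.items():
        blocks[attrs].add(gene)
    return dict(blocks)

def extent(context, attrs):
    """Return all genes that have every attribute in attrs."""
    return {gene for gene, own in context.items() if attrs <= own}

# Toy data: two groups of experiments, each already clustered.
clusterings = {
    "group_A": {"g1": 0, "g2": 0, "g3": 1, "g4": 1},
    "group_B": {"g1": 0, "g2": 0, "g3": 0, "g4": 1},
}

context = build_context(clusterings)
partition = partition_by_intent(context)

for attrs, genes in partition.items():
    print(sorted(genes), sorted(attrs))
```

In this toy input, g1 and g2 are clustered together in both groups and so share a block, while g3 and g4 each disagree with the others in at least one group and end up alone. The `extent` helper shows the other direction of the relation: the genes shared by a chosen set of clusters.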
format Online
Article
Text
id pubmed-4033618
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4033618 2014-06-10 A formal concept analysis approach to consensus clustering of multi-experiment expression data Hristoskova, Anna Boeva, Veselka Tsiporkova, Elena BMC Bioinformatics Methodology Article BioMed Central 2014-05-19 /pmc/articles/PMC4033618/ /pubmed/24885407 http://dx.doi.org/10.1186/1471-2105-15-151 Text en Copyright © 2014 Hristoskova et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle Methodology Article
Hristoskova, Anna
Boeva, Veselka
Tsiporkova, Elena
A formal concept analysis approach to consensus clustering of multi-experiment expression data
title A formal concept analysis approach to consensus clustering of multi-experiment expression data
title_sort formal concept analysis approach to consensus clustering of multi-experiment expression data
topic Methodology Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4033618/
https://www.ncbi.nlm.nih.gov/pubmed/24885407
http://dx.doi.org/10.1186/1471-2105-15-151