
Multiobjective triclustering of time-series transcriptome data reveals key genes of biological processes

BACKGROUND: Exploratory analysis of multi-dimensional high-throughput datasets, such as microarray gene expression time series, may be instrumental in understanding the genetic programs underlying numerous biological processes. In such datasets, variations in the gene expression profiles are usually...


Bibliographic Details
Main authors: Bhar, Anirban, Haubrock, Martin, Mukhopadhyay, Anirban, Wingender, Edgar
Format: Online Article Text
Language: English
Published: BioMed Central 2015
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4480927/
https://www.ncbi.nlm.nih.gov/pubmed/26108437
http://dx.doi.org/10.1186/s12859-015-0635-8
_version_ 1782378212179836928
author Bhar, Anirban
Haubrock, Martin
Mukhopadhyay, Anirban
Wingender, Edgar
collection PubMed
description BACKGROUND: Exploratory analysis of multi-dimensional high-throughput datasets, such as microarray gene expression time series, may be instrumental in understanding the genetic programs underlying numerous biological processes. In such datasets, variations in the gene expression profiles are usually observed across replicates and time points. Thus, mining the temporal expression patterns in such multi-dimensional datasets may not only provide insights into the key biological processes governing organ growth and development but also facilitate the understanding of the underlying complex gene regulatory circuits. RESULTS: In this work we have developed an evolutionary multi-objective optimization for our previously introduced triclustering algorithm δ-TRIMAX. Its aim is to make optimal use of δ-TRIMAX in extracting groups of co-expressed genes from time series gene expression data, or from any 3D gene expression dataset, by adding the powerful capabilities of an evolutionary algorithm to retrieve overlapping triclusters. We have compared the performance of our newly developed algorithm, EMOA-δ-TRIMAX, with that of other existing triclustering approaches using four artificial datasets and three real-life datasets. Moreover, we have analyzed the results of our algorithm on one of these real-life datasets monitoring the differentiation of human induced pluripotent stem cells (hiPSC) into mature cardiomyocytes. For each group of co-expressed genes belonging to one tricluster, we identified key genes by computing their membership values within the tricluster. A very high percentage of these key genes were significantly enriched in Gene Ontology categories or KEGG pathways that fit the biological context of cardiomyocyte differentiation very well.
CONCLUSIONS: EMOA-δ-TRIMAX has proven instrumental in identifying groups of genes in transcriptomic data sets that represent the functional categories constituting the biological process under study. The executable file can be found at http://www.bioinf.med.uni-goettingen.de/fileadmin/download/EMOA-delta-TRIMAX.tar.gz. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12859-015-0635-8) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-4480927
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4480927 2015-06-26 Multiobjective triclustering of time-series transcriptome data reveals key genes of biological processes Bhar, Anirban Haubrock, Martin Mukhopadhyay, Anirban Wingender, Edgar BMC Bioinformatics Research Article BioMed Central 2015-06-26 /pmc/articles/PMC4480927/ /pubmed/26108437 http://dx.doi.org/10.1186/s12859-015-0635-8 Text en © Bhar et al.; licensee BioMed Central. 2015 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle Research Article
Bhar, Anirban
Haubrock, Martin
Mukhopadhyay, Anirban
Wingender, Edgar
Multiobjective triclustering of time-series transcriptome data reveals key genes of biological processes
title Multiobjective triclustering of time-series transcriptome data reveals key genes of biological processes
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4480927/
https://www.ncbi.nlm.nih.gov/pubmed/26108437
http://dx.doi.org/10.1186/s12859-015-0635-8