Multiobjective triclustering of time-series transcriptome data reveals key genes of biological processes
BACKGROUND: Exploratory analysis of multi-dimensional high-throughput datasets, such as microarray gene expression time series, may be instrumental in understanding the genetic programs underlying numerous biological processes. In such datasets, variations in the gene expression profiles are usually...
Main authors: | Bhar, Anirban; Haubrock, Martin; Mukhopadhyay, Anirban; Wingender, Edgar |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2015 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4480927/ https://www.ncbi.nlm.nih.gov/pubmed/26108437 http://dx.doi.org/10.1186/s12859-015-0635-8 |
_version_ | 1782378212179836928 |
---|---|
author | Bhar, Anirban Haubrock, Martin Mukhopadhyay, Anirban Wingender, Edgar |
author_facet | Bhar, Anirban Haubrock, Martin Mukhopadhyay, Anirban Wingender, Edgar |
author_sort | Bhar, Anirban |
collection | PubMed |
description | BACKGROUND: Exploratory analysis of multi-dimensional high-throughput datasets, such as microarray gene expression time series, may be instrumental in understanding the genetic programs underlying numerous biological processes. In such datasets, variations in the gene expression profiles are usually observed across replicates and time points. Thus, mining the temporal expression patterns in such multi-dimensional datasets may not only provide insights into the key biological processes governing organ growth and development but also facilitate the understanding of the underlying complex gene regulatory circuits. RESULTS: In this work we have developed an evolutionary multi-objective optimization for our previously introduced triclustering algorithm δ-TRIMAX. Its aim is to make optimal use of δ-TRIMAX in extracting groups of co-expressed genes from time series gene expression data, or from any 3D gene expression dataset, by adding the powerful capabilities of an evolutionary algorithm to retrieve overlapping triclusters. We have compared the performance of our newly developed algorithm, EMOA-δ-TRIMAX, with that of other existing triclustering approaches using four artificial datasets and three real-life datasets. Moreover, we have analyzed the results of our algorithm on one of these real-life datasets monitoring the differentiation of human induced pluripotent stem cells (hiPSC) into mature cardiomyocytes. For each group of co-expressed genes belonging to one tricluster, we identified key genes by computing their membership values within the tricluster. A very high percentage of these key genes were significantly enriched in Gene Ontology categories or KEGG pathways that fit the biological context of cardiomyocyte differentiation well. CONCLUSIONS: EMOA-δ-TRIMAX has proven instrumental in identifying groups of genes in transcriptomic data sets that represent the functional categories constituting the biological process under study. The executable file can be found at http://www.bioinf.med.uni-goettingen.de/fileadmin/download/EMOA-delta-TRIMAX.tar.gz. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12859-015-0635-8) contains supplementary material, which is available to authorized users. |
format | Online Article Text |
id | pubmed-4480927 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2015 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-4480927 2015-06-26 Multiobjective triclustering of time-series transcriptome data reveals key genes of biological processes Bhar, Anirban Haubrock, Martin Mukhopadhyay, Anirban Wingender, Edgar BMC Bioinformatics Research Article BACKGROUND: Exploratory analysis of multi-dimensional high-throughput datasets, such as microarray gene expression time series, may be instrumental in understanding the genetic programs underlying numerous biological processes. In such datasets, variations in the gene expression profiles are usually observed across replicates and time points. Thus, mining the temporal expression patterns in such multi-dimensional datasets may not only provide insights into the key biological processes governing organ growth and development but also facilitate the understanding of the underlying complex gene regulatory circuits. RESULTS: In this work we have developed an evolutionary multi-objective optimization for our previously introduced triclustering algorithm δ-TRIMAX. Its aim is to make optimal use of δ-TRIMAX in extracting groups of co-expressed genes from time series gene expression data, or from any 3D gene expression dataset, by adding the powerful capabilities of an evolutionary algorithm to retrieve overlapping triclusters. We have compared the performance of our newly developed algorithm, EMOA-δ-TRIMAX, with that of other existing triclustering approaches using four artificial datasets and three real-life datasets. Moreover, we have analyzed the results of our algorithm on one of these real-life datasets monitoring the differentiation of human induced pluripotent stem cells (hiPSC) into mature cardiomyocytes. For each group of co-expressed genes belonging to one tricluster, we identified key genes by computing their membership values within the tricluster. A very high percentage of these key genes were significantly enriched in Gene Ontology categories or KEGG pathways that fit the biological context of cardiomyocyte differentiation well. CONCLUSIONS: EMOA-δ-TRIMAX has proven instrumental in identifying groups of genes in transcriptomic data sets that represent the functional categories constituting the biological process under study. The executable file can be found at http://www.bioinf.med.uni-goettingen.de/fileadmin/download/EMOA-delta-TRIMAX.tar.gz. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12859-015-0635-8) contains supplementary material, which is available to authorized users. BioMed Central 2015-06-26 /pmc/articles/PMC4480927/ /pubmed/26108437 http://dx.doi.org/10.1186/s12859-015-0635-8 Text en © Bhar et al.; licensee BioMed Central. 2015 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Research Article Bhar, Anirban Haubrock, Martin Mukhopadhyay, Anirban Wingender, Edgar Multiobjective triclustering of time-series transcriptome data reveals key genes of biological processes |
title | Multiobjective triclustering of time-series transcriptome data reveals key genes of biological processes |
title_full | Multiobjective triclustering of time-series transcriptome data reveals key genes of biological processes |
title_fullStr | Multiobjective triclustering of time-series transcriptome data reveals key genes of biological processes |
title_full_unstemmed | Multiobjective triclustering of time-series transcriptome data reveals key genes of biological processes |
title_short | Multiobjective triclustering of time-series transcriptome data reveals key genes of biological processes |
title_sort | multiobjective triclustering of time-series transcriptome data reveals key genes of biological processes |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4480927/ https://www.ncbi.nlm.nih.gov/pubmed/26108437 http://dx.doi.org/10.1186/s12859-015-0635-8 |
work_keys_str_mv | AT bharanirban multiobjectivetriclusteringoftimeseriestranscriptomedatarevealskeygenesofbiologicalprocesses AT haubrockmartin multiobjectivetriclusteringoftimeseriestranscriptomedatarevealskeygenesofbiologicalprocesses AT mukhopadhyayanirban multiobjectivetriclusteringoftimeseriestranscriptomedatarevealskeygenesofbiologicalprocesses AT wingenderedgar multiobjectivetriclusteringoftimeseriestranscriptomedatarevealskeygenesofbiologicalprocesses |