
A phase synchronization clustering algorithm for identifying interesting groups of genes from cell cycle expression data

BACKGROUND: Previous studies of genome-wide expression patterns show that a certain percentage of genes are cell cycle regulated. Expression data have been analyzed in a number of different ways to identify cell cycle dependent genes. In this study, we pose the hypothesis that cell cycle depe...


Bibliographic Details
Main authors: Kim, Chang Sik, Bae, Cheol Soo, Tcha, Hong Joon
Format: Text
Language: English
Published: BioMed Central 2008
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2335309/
https://www.ncbi.nlm.nih.gov/pubmed/18221564
http://dx.doi.org/10.1186/1471-2105-9-56
author Kim, Chang Sik
Bae, Cheol Soo
Tcha, Hong Joon
collection PubMed
description BACKGROUND: Previous studies of genome-wide expression patterns show that a certain percentage of genes are cell cycle regulated. Expression data have been analyzed in a number of different ways to identify cell cycle dependent genes. In this study, we hypothesize that cell cycle dependent genes can be regarded as oscillating systems with a rhythm, i.e. systems producing response signals with a period and frequency. We are therefore motivated to apply the theory of multivariate phase synchronization to the clustering of cell cycle specific genome-wide expression data. RESULTS: We propose a strategy for finding groups of genes associated with a specific biological process by analyzing cell cycle specific gene expression data. To evaluate the proposed method, we use the modified Kuramoto model, a phase governing equation that describes the long-term dynamics of globally coupled oscillators. With this equation, we simulate two groups of expression signals, where the signals within each group share a common rhythm. The simulated expression data are then mixed with randomly generated expression data and used as the input data set for the algorithm. Using these simulated data, we show that the algorithm is able to identify expression signals that are involved in the same oscillating process. We also evaluate the method with yeast cell cycle expression data. The output clusters of the proposed algorithm include genes that are closely associated with each other, in that they share significant Gene Ontology terms of biological process and/or have relatively many known biological interactions. The evaluation therefore indicates that the method is able to identify expression signals according to the specific biological process. 
The evaluation also indicates that some portion of the output of the proposed algorithm is not obtainable by traditional clustering algorithms based on Euclidean distance or linear correlation. CONCLUSION: Based on the evaluation experiments, we draw the following conclusions: 1) Based on the theory of multivariate phase synchronization, it is feasible to find groups of genes that have relevant biological interactions and/or significantly shared GO slim terms of biological process, using cell cycle specific gene expression signals. 2) Among the output clusters of the proposed algorithm, relatively large clusters tend to include more known interactions than relatively small ones. 3) It is feasible to understand cell cycle specific gene expression patterns as a phenomenon of collective synchronization. 4) The proposed algorithm is able to find prominent groups of genes that are not obtainable by traditional clustering algorithms.
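As a rough illustration of the idea of collective synchronization described in the abstract, the following minimal sketch integrates the standard (not the paper's modified) Kuramoto model and measures how coherent the resulting phases are. The function names, the frequency distribution, the coupling strengths, and the step size are all assumptions chosen for illustration; none of them come from the article.

```python
import cmath
import math
import random


def simulate_kuramoto(n, coupling, steps=2000, dt=0.01, seed=0):
    """Euler-integrate the standard Kuramoto model for n oscillators.

    Hypothetical helper: the article uses a *modified* Kuramoto model
    whose exact form is not given in this record.
    """
    rng = random.Random(seed)
    # Natural frequencies drawn around 1.0 (assumed values, arbitrary units).
    omega = [rng.gauss(1.0, 0.1) for _ in range(n)]
    # Random initial phases in [0, 2*pi).
    theta = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n)]
    for _ in range(steps):
        # Mean field: r * exp(i*psi) = (1/n) * sum_j exp(i*theta_j)
        mean_field = sum(cmath.exp(1j * t) for t in theta) / n
        r, psi = abs(mean_field), cmath.phase(mean_field)
        # d(theta_i)/dt = omega_i + K * r * sin(psi - theta_i)
        theta = [t + dt * (w + coupling * r * math.sin(psi - t))
                 for t, w in zip(theta, omega)]
    return theta


def order_parameter(theta):
    """Return r in [0, 1]; values near 1 mean the phases are synchronized."""
    return abs(sum(cmath.exp(1j * t) for t in theta) / len(theta))


# One coupled group (shares a rhythm) and one uncoupled group (no shared rhythm).
coupled = simulate_kuramoto(n=20, coupling=2.0)
uncoupled = simulate_kuramoto(n=20, coupling=0.0)
print("coupled r:", order_parameter(coupled))
print("uncoupled r:", order_parameter(uncoupled))
```

In this sketch the coupled group ends up with an order parameter close to 1, while the uncoupled group does not share a rhythm. A clustering method built on phase synchronization would, loosely speaking, group signals whose phases stay locked in this sense rather than signals that are close in Euclidean distance.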
format Text
id pubmed-2335309
institution National Center for Biotechnology Information
language English
publishDate 2008
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-2335309 2008-04-28 A phase synchronization clustering algorithm for identifying interesting groups of genes from cell cycle expression data Kim, Chang Sik Bae, Cheol Soo Tcha, Hong Joon BMC Bioinformatics Research Article BioMed Central 2008-01-28 /pmc/articles/PMC2335309/ /pubmed/18221564 http://dx.doi.org/10.1186/1471-2105-9-56 Text en Copyright © 2008 Kim et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title A phase synchronization clustering algorithm for identifying interesting groups of genes from cell cycle expression data
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2335309/
https://www.ncbi.nlm.nih.gov/pubmed/18221564
http://dx.doi.org/10.1186/1471-2105-9-56