Clustering of time-course gene expression profiles using normal mixture models with autoregressive random effects

BACKGROUND: Time-course gene expression data such as yeast cell cycle data may be periodically expressed. To cluster such data, currently used Fourier series approximations of periodic gene expressions have been found not to be sufficiently adequate to model the complexity of the time-course data, p...

Bibliographic Details
Main Authors: Wang, Kui; Ng, Shu Kay; McLachlan, Geoffrey J
Format: Online Article Text
Language: English
Published: BioMed Central 2012
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3574839/
https://www.ncbi.nlm.nih.gov/pubmed/23151154
http://dx.doi.org/10.1186/1471-2105-13-300
author Wang, Kui
Ng, Shu Kay
McLachlan, Geoffrey J
collection PubMed
description BACKGROUND: Time-course gene expression data such as yeast cell cycle data may be periodically expressed. To cluster such data, currently used Fourier series approximations of periodic gene expressions have been found not to be sufficiently adequate to model the complexity of the time-course data, partly due to their ignoring the dependence between the expression measurements over time and the correlation among gene expression profiles. We further investigate the advantages and limitations of available models in the literature and propose a new mixture model with autoregressive random effects of the first order for the clustering of time-course gene-expression profiles. Some simulations and real examples are given to demonstrate the usefulness of the proposed models. RESULTS: We illustrate the applicability of our new model using synthetic and real time-course datasets. We show that our model outperforms existing models to provide more reliable and robust clustering of time-course data. Our model provides superior results when genetic profiles are correlated. It also gives comparable results when the correlation between the gene profiles is weak. In the applications to real time-course data, relevant clusters of coregulated genes are obtained, which are supported by gene-function annotation databases. CONCLUSIONS: Our new model under our extension of the EMMIX-WIRE procedure is more reliable and robust for clustering time-course data because it adopts a random effects model that allows for the correlation among observations at different time points. It postulates gene-specific random effects with an autocorrelation variance structure that models coregulation within the clusters. The developed R package is flexible in its specification of the random effects through user-input parameters that enables improved modelling and consequent clustering of time-course data.
format Online
Article
Text
id pubmed-3574839
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3574839 2013-02-20 Clustering of time-course gene expression profiles using normal mixture models with autoregressive random effects Wang, Kui Ng, Shu Kay McLachlan, Geoffrey J BMC Bioinformatics Methodology Article BioMed Central 2012-11-14 /pmc/articles/PMC3574839/ /pubmed/23151154 http://dx.doi.org/10.1186/1471-2105-13-300 Text en Copyright ©2012 Wang et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Clustering of time-course gene expression profiles using normal mixture models with autoregressive random effects
topic Methodology Article