Predictive response-relevant clustering of expression data provides insights into disease processes
This article describes and illustrates a novel method of microarray data analysis that couples model-based clustering and binary classification to form clusters of 'response-relevant' genes; that is, genes that are informative when discriminating between the different values of the response. Pr...
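Main authors: | Hopcroft, Lisa E. M.; McBride, Martin W.; Harris, Keith J.; Sampson, Amanda K.; McClure, John D.; Graham, Delyth; Young, Graham; Holyoake, Tessa L.; Girolami, Mark A.; Dominiczak, Anna F. |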
---|---|
Format: | Text |
Language: | English |
Published: | Oxford University Press, 2010 |
Subjects: | Computational Biology |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2978340/ https://www.ncbi.nlm.nih.gov/pubmed/20571087 http://dx.doi.org/10.1093/nar/gkq550 |
_version_ | 1782191244842106880 |
---|---|
author | Hopcroft, Lisa E. M. McBride, Martin W. Harris, Keith J. Sampson, Amanda K. McClure, John D. Graham, Delyth Young, Graham Holyoake, Tessa L. Girolami, Mark A. Dominiczak, Anna F. |
author_facet | Hopcroft, Lisa E. M. McBride, Martin W. Harris, Keith J. Sampson, Amanda K. McClure, John D. Graham, Delyth Young, Graham Holyoake, Tessa L. Girolami, Mark A. Dominiczak, Anna F. |
author_sort | Hopcroft, Lisa E. M. |
collection | PubMed |
description | This article describes and illustrates a novel method of microarray data analysis that couples model-based clustering and binary classification to form clusters of 'response-relevant' genes; that is, genes that are informative when discriminating between the different values of the response. Predictions are subsequently made using an appropriate statistical summary of each gene cluster, which we call the 'meta-covariate' representation of the cluster, in a probit regression model. We first illustrate this method by analysing a leukaemia expression dataset, before focusing closely on the meta-covariate analysis of a renal gene expression dataset in a rat model of salt-sensitive hypertension. We explore the biological insights provided by our analysis of these data. In particular, we identify a highly influential cluster of 13 genes, including three transcription factors (Arntl, Bhlhe41 and Npas2), that is implicated as being protective against hypertension in response to increased dietary sodium. Functional and canonical pathway analysis of this cluster using Ingenuity Pathway Analysis implicated transcriptional activation and circadian rhythm signalling, respectively. Although we illustrate our method using only expression data, the method is applicable to any high-dimensional datasets. Expression data are available at ArrayExpress (accession number E-MEXP-2514) and code is available at http://www.dcs.gla.ac.uk/inference/metacovariateanalysis/. |
format | Text |
id | pubmed-2978340 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2010 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-29783402010-11-12 Predictive response-relevant clustering of expression data provides insights into disease processes Hopcroft, Lisa E. M. McBride, Martin W. Harris, Keith J. Sampson, Amanda K. McClure, John D. Graham, Delyth Young, Graham Holyoake, Tessa L. Girolami, Mark A. Dominiczak, Anna F. Nucleic Acids Res Computational Biology This article describes and illustrates a novel method of microarray data analysis that couples model-based clustering and binary classification to form clusters of `response-relevant' genes; that is, genes that are informative when discriminating between the different values of the response. Predictions are subsequently made using an appropriate statistical summary of each gene cluster, which we call the `meta-covariate' representation of the cluster, in a probit regression model. We first illustrate this method by analysing a leukaemia expression dataset, before focusing closely on the meta-covariate analysis of a renal gene expression dataset in a rat model of salt-sensitive hypertension. We explore the biological insights provided by our analysis of these data. In particular, we identify a highly influential cluster of 13 genes—including three transcription factors (Arntl, Bhlhe41 and Npas2)—that is implicated as being protective against hypertension in response to increased dietary sodium. Functional and canonical pathway analysis of this cluster using Ingenuity Pathway Analysis implicated transcriptional activation and circadian rhythm signalling, respectively. Although we illustrate our method using only expression data, the method is applicable to any high-dimensional datasets. Expression data are available at ArrayExpress (accession number E-MEXP-2514) and code is available at http://www.dcs.gla.ac.uk/inference/metacovariateanalysis/. Oxford University Press 2010-11 2010-06-22 /pmc/articles/PMC2978340/ /pubmed/20571087 http://dx.doi.org/10.1093/nar/gkq550 Text en © The Author(s) 2010. 
Published by Oxford University Press. http://creativecommons.org/licenses/by-nc/2.5 This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.5), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Computational Biology Hopcroft, Lisa E. M. McBride, Martin W. Harris, Keith J. Sampson, Amanda K. McClure, John D. Graham, Delyth Young, Graham Holyoake, Tessa L. Girolami, Mark A. Dominiczak, Anna F. Predictive response-relevant clustering of expression data provides insights into disease processes |
title | Predictive response-relevant clustering of expression data provides insights into disease processes |
title_full | Predictive response-relevant clustering of expression data provides insights into disease processes |
title_fullStr | Predictive response-relevant clustering of expression data provides insights into disease processes |
title_full_unstemmed | Predictive response-relevant clustering of expression data provides insights into disease processes |
title_short | Predictive response-relevant clustering of expression data provides insights into disease processes |
title_sort | predictive response-relevant clustering of expression data provides insights into disease processes |
topic | Computational Biology |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2978340/ https://www.ncbi.nlm.nih.gov/pubmed/20571087 http://dx.doi.org/10.1093/nar/gkq550 |
work_keys_str_mv | AT hopcroftlisaem predictiveresponserelevantclusteringofexpressiondataprovidesinsightsintodiseaseprocesses AT mcbridemartinw predictiveresponserelevantclusteringofexpressiondataprovidesinsightsintodiseaseprocesses AT harriskeithj predictiveresponserelevantclusteringofexpressiondataprovidesinsightsintodiseaseprocesses AT sampsonamandak predictiveresponserelevantclusteringofexpressiondataprovidesinsightsintodiseaseprocesses AT mcclurejohnd predictiveresponserelevantclusteringofexpressiondataprovidesinsightsintodiseaseprocesses AT grahamdelyth predictiveresponserelevantclusteringofexpressiondataprovidesinsightsintodiseaseprocesses AT younggraham predictiveresponserelevantclusteringofexpressiondataprovidesinsightsintodiseaseprocesses AT holyoaketessal predictiveresponserelevantclusteringofexpressiondataprovidesinsightsintodiseaseprocesses AT girolamimarka predictiveresponserelevantclusteringofexpressiondataprovidesinsightsintodiseaseprocesses AT dominiczakannaf predictiveresponserelevantclusteringofexpressiondataprovidesinsightsintodiseaseprocesses |
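
The description above mentions summarising each gene cluster as a 'meta-covariate' and using those summaries in a probit regression model. The following is a minimal sketch of that idea only, not the authors' implementation (their code is at the URL given in the description). It assumes the cluster summary is a plain per-sample mean and that cluster assignments and regression weights are already known; the function names, the toy data and the weights are all hypothetical.

```python
import math


def meta_covariates(expression, assignments, n_clusters):
    """Summarise each gene cluster by its mean expression, per sample.

    expression: list of samples, each a list of per-gene values.
    assignments: cluster index (0..n_clusters-1) for each gene.
    Assumption: the mean is used as the cluster summary.
    """
    result = []
    for sample in expression:
        sums = [0.0] * n_clusters
        counts = [0] * n_clusters
        for value, k in zip(sample, assignments):
            sums[k] += value
            counts[k] += 1
        # Empty clusters get a neutral summary of 0.0.
        result.append([s / c if c else 0.0 for s, c in zip(sums, counts)])
    return result


def probit_probability(meta, weights, bias=0.0):
    """P(response = 1) under a probit link: Phi(bias + w . meta)."""
    z = bias + sum(w * m for w, m in zip(weights, meta))
    # Standard normal CDF written in terms of the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Toy data (hypothetical): 2 samples x 4 genes, two clusters of two genes.
expression = [
    [1.0, 3.0, -2.0, -4.0],
    [-1.0, -3.0, 2.0, 4.0],
]
assignments = [0, 0, 1, 1]
weights = [1.0, -1.0]  # hypothetical, not fitted to any data

metas = meta_covariates(expression, assignments, 2)
probs = [probit_probability(m, weights) for m in metas]
```

In this sketch the clustering and the regression are separate, fixed steps; the description indicates that the published method couples clustering and classification, which this example does not attempt to reproduce.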