
Predictive response-relevant clustering of expression data provides insights into disease processes

This article describes and illustrates a novel method of microarray data analysis that couples model-based clustering and binary classification to form clusters of 'response-relevant' genes; that is, genes that are informative when discriminating between the different values of the response. Pr...


Bibliographic Details
Main authors: Hopcroft, Lisa E. M., McBride, Martin W., Harris, Keith J., Sampson, Amanda K., McClure, John D., Graham, Delyth, Young, Graham, Holyoake, Tessa L., Girolami, Mark A., Dominiczak, Anna F.
Format: Text
Language: English
Published: Oxford University Press 2010
Subjects: Computational Biology
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2978340/
https://www.ncbi.nlm.nih.gov/pubmed/20571087
http://dx.doi.org/10.1093/nar/gkq550
author Hopcroft, Lisa E. M.
McBride, Martin W.
Harris, Keith J.
Sampson, Amanda K.
McClure, John D.
Graham, Delyth
Young, Graham
Holyoake, Tessa L.
Girolami, Mark A.
Dominiczak, Anna F.
collection PubMed
description This article describes and illustrates a novel method of microarray data analysis that couples model-based clustering and binary classification to form clusters of 'response-relevant' genes; that is, genes that are informative when discriminating between the different values of the response. Predictions are subsequently made using an appropriate statistical summary of each gene cluster, which we call the 'meta-covariate' representation of the cluster, in a probit regression model. We first illustrate this method by analysing a leukaemia expression dataset, before focusing closely on the meta-covariate analysis of a renal gene expression dataset in a rat model of salt-sensitive hypertension. We explore the biological insights provided by our analysis of these data. In particular, we identify a highly influential cluster of 13 genes, including three transcription factors (Arntl, Bhlhe41 and Npas2), that is implicated as being protective against hypertension in response to increased dietary sodium. Functional and canonical pathway analysis of this cluster using Ingenuity Pathway Analysis implicated transcriptional activation and circadian rhythm signalling, respectively. Although we illustrate our method using only expression data, the method is applicable to any high-dimensional dataset. Expression data are available at ArrayExpress (accession number E-MEXP-2514) and code is available at http://www.dcs.gla.ac.uk/inference/metacovariateanalysis/.
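The idea of summarising each gene cluster as a 'meta-covariate' and using those summaries in a probit regression can be illustrated with a small sketch. This is not the authors' implementation: the abstract describes clustering and classification as coupled, whereas the sketch below runs them as two separate stages, substitutes plain k-means for model-based clustering, uses the cluster mean as the summary statistic, and fits the probit model by ridge-penalised gradient ascent. All function names, parameters and the synthetic data are hypothetical.

```python
import math
import numpy as np

_erf = np.vectorize(math.erf)

def norm_cdf(z):
    """Standard normal CDF, evaluated elementwise."""
    return 0.5 * (1.0 + _erf(z / math.sqrt(2.0)))

def norm_pdf(z):
    """Standard normal density, evaluated elementwise."""
    return np.exp(-0.5 * z ** 2) / math.sqrt(2.0 * math.pi)

def kmeans_labels(expr, k, n_iter=50, seed=0):
    """Cluster genes (rows of expr) with plain k-means.
    Assumption: a simple stand-in for model-based clustering."""
    rng = np.random.default_rng(seed)
    centers = expr[rng.choice(len(expr), size=k, replace=False)]
    for _ in range(n_iter):
        dist = ((expr[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = expr[labels == j].mean(axis=0)
    return labels

def meta_covariates(expr, labels, k):
    """Summarise each cluster by its mean profile across samples.
    Returns a (samples x k) matrix; assumes no cluster is empty."""
    return np.column_stack([expr[labels == j].mean(axis=0) for j in range(k)])

def fit_probit(Z, y, lr=0.1, n_iter=2000, ridge=1e-2):
    """Fit P(y = 1 | z) = Phi(w0 + z . w) by gradient ascent on the
    ridge-penalised mean log-likelihood."""
    A = np.column_stack([np.ones(len(Z)), Z])  # prepend intercept column
    w = np.zeros(A.shape[1])
    for _ in range(n_iter):
        z = A @ w
        p = np.clip(norm_cdf(z), 1e-9, 1.0 - 1e-9)
        # d/dz of y*log(Phi) + (1 - y)*log(1 - Phi)
        g = (y - p) * norm_pdf(z) / (p * (1.0 - p))
        w += lr * (A.T @ g / len(y) - ridge * w)
    return w

def predict_proba(Z, w):
    """Predicted probability that the response equals 1."""
    return norm_cdf(w[0] + Z @ w[1:])

def run_demo(seed=1):
    """Synthetic data: 10 response-relevant genes, 20 noise genes, 20 samples."""
    rng = np.random.default_rng(seed)
    y = np.repeat([0, 1], 10)
    informative = 2.0 * y + 0.3 * rng.standard_normal((10, 20))
    noise = 0.3 * rng.standard_normal((20, 20))
    expr = np.vstack([informative, noise])  # genes x samples
    labels = kmeans_labels(expr, k=2)
    Z = meta_covariates(expr, labels, k=2)
    w = fit_probit(Z, y)
    accuracy = np.mean((predict_proba(Z, w) >= 0.5) == y)
    return labels, Z, w, accuracy

if __name__ == "__main__":
    labels, Z, w, accuracy = run_demo()
    print("cluster sizes:", np.bincount(labels))
    print("probit weights (intercept first):", np.round(w, 2))
    print("training accuracy:", accuracy)
```

In this toy setting the magnitude of each fitted weight gives a rough indication of how influential the corresponding cluster is for the response, which is the kind of interpretation the abstract attaches to its 13-gene cluster. A joint treatment of clustering and regression, as in the article, would let the response inform cluster membership as well.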
format Text
id pubmed-2978340
institution National Center for Biotechnology Information
language English
publishDate 2010
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-2978340 2010-11-12 Nucleic Acids Res Computational Biology Oxford University Press 2010-11 2010-06-22 Text en
© The Author(s) 2010. Published by Oxford University Press.
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.5), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Predictive response-relevant clustering of expression data provides insights into disease processes
topic Computational Biology