
Modeling gene expression measurement error: a quasi-likelihood approach

BACKGROUND: Using suitable error models for gene expression measurements is essential in the statistical analysis of microarray data. However, the true probabilistic model underlying gene expression intensity readings is generally not known. Instead, in currently used approaches some simple parametr...


Bibliographic Details
Main author: Strimmer, Korbinian
Format: Text
Language: English
Published: BioMed Central 2003
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC153502/
https://www.ncbi.nlm.nih.gov/pubmed/12659637
http://dx.doi.org/10.1186/1471-2105-4-10
author Strimmer, Korbinian
collection PubMed
description BACKGROUND: Using suitable error models for gene expression measurements is essential in the statistical analysis of microarray data. However, the true probabilistic model underlying gene expression intensity readings is generally not known. Instead, currently used approaches either assume some simple parametric model (usually a transformed normal distribution) or estimate the empirical distribution. Neither strategy may be optimal for gene expression data: the non-parametric approach ignores known structural information, whereas fully parametric models run the risk of misspecification. A further related problem is the choice of a suitable scale for the model (e.g. observed vs. log-scale). RESULTS: Here a simple semi-parametric model for gene expression measurement error is presented. In this approach inference is based on an approximate likelihood function (the extended quasi-likelihood). Only partial knowledge about the unknown true distribution is required to construct this function. In the case of gene expression this information is available in the form of the postulated (e.g. quadratic) variance structure of the data. As the quasi-likelihood behaves (almost) like a proper likelihood, it allows for the estimation of calibration and variance parameters, and it is also straightforward to obtain corresponding approximate confidence intervals. Unlike most other frameworks, it also allows analysis on any preferred scale, i.e. on the original linear scale as well as on a transformed scale. It can also be employed in regression approaches to model systematic (e.g. array or dye) effects. CONCLUSIONS: The quasi-likelihood framework provides a simple and versatile approach to analyzing gene expression data that does not make any strong distributional assumptions about the underlying error model. For several simulated as well as real data sets it provides a better fit to the data than competing models. In an example it also improved the power of tests to identify differential expression.
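The abstract's central idea, an approximate likelihood built only from a postulated variance structure, can be illustrated with a minimal sketch. It uses the standard Nelder-Pregibon form of the extended quasi-likelihood together with a quadratic variance function V(mu) = a + b * mu**2. The parameter names `a`, `b`, `phi`, their default values, and the single shared mean `mu` are assumptions made for illustration; the article's own parametrization (including its calibration parameters) may differ.

```python
import math

def variance(mu, a=1.0, b=0.04):
    """Hypothetical quadratic variance function V(mu) = a + b * mu**2."""
    return a + b * mu * mu

def deviance(y, mu, var=variance, steps=200):
    """Quasi-deviance D(y; mu) = 2 * integral from mu to y of (y - t) / V(t) dt,
    approximated with composite Simpson's rule (steps must be even)."""
    if y == mu:
        return 0.0
    h = (y - mu) / steps
    total = 0.0
    for i in range(steps + 1):
        t = mu + i * h
        # Simpson weights: 1 at the end points, then alternating 4 and 2.
        weight = 1 if i in (0, steps) else (4 if i % 2 else 2)
        total += weight * (y - t) / var(t)
    return 2.0 * total * h / 3.0

def extended_quasi_loglik(ys, mu, phi=1.0, var=variance):
    """Extended quasi-log-likelihood summed over observations:
    Q+ = sum( -0.5 * log(2*pi*phi*V(y)) - D(y; mu) / (2*phi) )."""
    return sum(
        -0.5 * math.log(2.0 * math.pi * phi * var(y))
        - deviance(y, mu, var) / (2.0 * phi)
        for y in ys
    )

if __name__ == "__main__":
    # Made-up intensity readings, used only to show the call pattern.
    readings = [4.2, 5.1, 6.3, 4.8]
    for candidate_mu in (4.0, 5.0, 6.0):
        print(candidate_mu, extended_quasi_loglik(readings, candidate_mu))
```

With a constant variance function the expression reduces to the normal log-likelihood, which is one way to see why the quasi-likelihood behaves almost like a proper likelihood. Maximizing it over `mu`, `a`, `b`, and `phi` would give parameter estimates in the spirit the abstract describes.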
format Text
id pubmed-153502
institution National Center for Biotechnology Information
language English
publishDate 2003
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-153502 2003-04-19 Modeling gene expression measurement error: a quasi-likelihood approach Strimmer, Korbinian BMC Bioinformatics Research Article BioMed Central 2003-03-20 /pmc/articles/PMC153502/ /pubmed/12659637 http://dx.doi.org/10.1186/1471-2105-4-10 Text en Copyright © 2003 Strimmer; licensee BioMed Central Ltd. This is an Open Access article: verbatim copying and redistribution of this article are permitted in all media for any purpose, provided this notice is preserved along with the article's original URL.
title Modeling gene expression measurement error: a quasi-likelihood approach
topic Research Article