Overdispersed logistic regression for SAGE: Modelling multiple groups and covariates

BACKGROUND: Two major identifiable sources of variation in data derived from the Serial Analysis of Gene Expression (SAGE) are within-library sampling variability and between-library heterogeneity within a group. Most published methods for identifying differential expression focus on just the sampli...

Bibliographic Details
Main Authors: Baggerly, Keith A; Deng, Li; Morris, Jeffrey S; Aldaz, C Marcelo
Format: Text
Language: English
Published: BioMed Central 2004
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC524524/
https://www.ncbi.nlm.nih.gov/pubmed/15469612
http://dx.doi.org/10.1186/1471-2105-5-144
_version_ 1782121911835164672
author Baggerly, Keith A
Deng, Li
Morris, Jeffrey S
Aldaz, C Marcelo
collection PubMed
description BACKGROUND: Two major identifiable sources of variation in data derived from the Serial Analysis of Gene Expression (SAGE) are within-library sampling variability and between-library heterogeneity within a group. Most published methods for identifying differential expression focus on just the sampling variability. In recent work, the problem of assessing differential expression between two groups of SAGE libraries has been addressed by introducing a beta-binomial hierarchical model that explicitly deals with both of the above sources of variation. This model leads to a test statistic analogous to a weighted two-sample t-test. When the number of groups involved is more than two, however, a more general approach is needed. RESULTS: We describe how logistic regression with overdispersion supplies this generalization, carrying with it the framework for incorporating other covariates into the model as a byproduct. This approach has the advantage that logistic regression routines are available in several common statistical packages. CONCLUSIONS: The described method provides an easily implemented tool for analyzing SAGE data that correctly handles multiple types of variation and allows for more flexible modelling.
format Text
id pubmed-524524
institution National Center for Biotechnology Information
language English
publishDate 2004
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-524524 2004-10-31 Overdispersed logistic regression for SAGE: Modelling multiple groups and covariates Baggerly, Keith A Deng, Li Morris, Jeffrey S Aldaz, C Marcelo BMC Bioinformatics Research Article BACKGROUND: Two major identifiable sources of variation in data derived from the Serial Analysis of Gene Expression (SAGE) are within-library sampling variability and between-library heterogeneity within a group. Most published methods for identifying differential expression focus on just the sampling variability. In recent work, the problem of assessing differential expression between two groups of SAGE libraries has been addressed by introducing a beta-binomial hierarchical model that explicitly deals with both of the above sources of variation. This model leads to a test statistic analogous to a weighted two-sample t-test. When the number of groups involved is more than two, however, a more general approach is needed. RESULTS: We describe how logistic regression with overdispersion supplies this generalization, carrying with it the framework for incorporating other covariates into the model as a byproduct. This approach has the advantage that logistic regression routines are available in several common statistical packages. CONCLUSIONS: The described method provides an easily implemented tool for analyzing SAGE data that correctly handles multiple types of variation and allows for more flexible modelling. BioMed Central 2004-10-06 /pmc/articles/PMC524524/ /pubmed/15469612 http://dx.doi.org/10.1186/1471-2105-5-144 Text en Copyright © 2004 Baggerly et al; licensee BioMed Central Ltd.
title Overdispersed logistic regression for SAGE: Modelling multiple groups and covariates
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC524524/
https://www.ncbi.nlm.nih.gov/pubmed/15469612
http://dx.doi.org/10.1186/1471-2105-5-144
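
The description field above refers to logistic regression with overdispersion for more than two groups of SAGE libraries. As a rough illustration of that idea only, the sketch below fits a groups-only logistic model and inflates its standard errors by a Pearson dispersion factor (a quasi-likelihood shortcut). This is not the authors' estimator: the record does not give their fitting procedure, and the tag counts, library sizes, group labels, and function names are all hypothetical.

```python
import math

# Hypothetical data: counts of one tag in six libraries, three groups of two.
counts = [10, 30, 12, 40, 55, 38]   # tag counts per library (made up)
totals = [50000] * 6                # library sizes (made up)
groups = ["A", "A", "B", "B", "C", "C"]


def fit_groups_only(counts, totals, groups):
    """Fit a logistic model with one indicator per group.

    For this design the maximum-likelihood fitted proportion of each
    group is simply its pooled proportion, so no iteration is needed.
    Returns (pooled proportions, group sizes, Pearson dispersion).
    Assumes every pooled proportion is strictly between 0 and 1.
    """
    sum_y, sum_n = {}, {}
    for y, n, g in zip(counts, totals, groups):
        sum_y[g] = sum_y.get(g, 0) + y
        sum_n[g] = sum_n.get(g, 0) + n
    pooled = {g: sum_y[g] / sum_n[g] for g in sum_y}

    # Pearson chi-square against binomial variance n * p * (1 - p).
    chi2 = 0.0
    for y, n, g in zip(counts, totals, groups):
        p = pooled[g]
        chi2 += (y - n * p) ** 2 / (n * p * (1 - p))

    # Residual degrees of freedom: libraries minus fitted parameters.
    df = len(counts) - len(pooled)
    phi = chi2 / df
    return pooled, sum_n, phi


def logit_standard_error(p, n, phi=1.0):
    """Standard error of a group's log-odds, scaled by dispersion phi."""
    return math.sqrt(phi / (n * p * (1 - p)))


pooled, group_sizes, phi = fit_groups_only(counts, totals, groups)
naive_se = logit_standard_error(pooled["A"], group_sizes["A"])
inflated_se = logit_standard_error(pooled["A"], group_sizes["A"], phi)

print("dispersion factor:", round(phi, 2))
for g in sorted(pooled):
    print(g, "log-odds:", round(math.log(pooled[g] / (1 - pooled[g])), 3))
print("group A SE, binomial only:", round(naive_se, 3))
print("group A SE, overdispersed:", round(inflated_se, 3))
```

A dispersion factor well above 1 indicates that libraries within a group vary more than binomial sampling alone would predict, which is the between-library heterogeneity the abstract describes. Ignoring it would make the group differences look more certain than they are.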