
An accurate paired sample test for count data


Bibliographic Details
Main Authors:	Pham, Thang V., Jimenez, Connie R.
Format:	Online Article Text
Language:	English
Published:	Oxford University Press 2012
Subjects:	Original Papers
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3436821/ https://www.ncbi.nlm.nih.gov/pubmed/22962487 http://dx.doi.org/10.1093/bioinformatics/bts394

author	Pham, Thang V.; Jimenez, Connie R.
collection	PubMed
description	Motivation: Recent technology platforms in proteomics and genomics produce count data for quantitative analysis. Previous works on statistical significance analysis for count data have mainly focused on the independent sample setting, which does not cover the case where pairs of measurements are taken from individual patients before and after treatment. This experimental setting requires paired sample testing such as the paired t-test often used for continuous measurements. A state-of-the-art method uses a negative binomial distribution in a generalized linear model framework for paired sample testing. A paired sample design assumes that the relative change within each pair is constant across biological samples. This model can be used as an approximation to the true model in cases of heterogeneity of response in complex biological systems. We aim to specify the variation in response explicitly in combination with the inherent technical variation. Results: We formulate the problem of paired sample test for count data in a framework of statistical combination of multiple contingency tables. In particular, we specify explicitly a random distribution for the effect with an inverted beta model. The technical variation can be modeled by either a standard Poisson distribution or an exponentiated Poisson distribution, depending on the reproducibility of the acquisition workflow. The new statistical test is evaluated on both proteomics and genomics datasets, showing a comparable performance to the state-of-the-art method in general, and in several cases where the two methods differ, the proposed test returns more reasonable p-values. Availability: Available for download at http://www.oncoproteomics.nl/. Contact: t.pham@vumc.nl
format	Online Article Text
id	pubmed-3436821
institution	National Center for Biotechnology Information
language	English
publishDate	2012
publisher	Oxford University Press
record_format	MEDLINE/PubMed
spelling	pubmed-3436821 2012-12-12 An accurate paired sample test for count data Pham, Thang V.; Jimenez, Connie R. Bioinformatics Original Papers Oxford University Press 2012-09-15 2012-09-03 /pmc/articles/PMC3436821/ /pubmed/22962487 http://dx.doi.org/10.1093/bioinformatics/bts394 Text en © The Author(s) (2012). Published by Oxford University Press. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title	An accurate paired sample test for count data
topic	Original Papers
