An accurate paired sample test for count data
Motivation: Recent technology platforms in proteomics and genomics produce count data for quantitative analysis. Previous works on statistical significance analysis for count data have mainly focused on the independent sample setting, which does not cover the case where pairs of measurements are tak...
Main Authors: | Pham, Thang V., Jimenez, Connie R. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2012 |
Subjects: | Original Papers |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3436821/ https://www.ncbi.nlm.nih.gov/pubmed/22962487 http://dx.doi.org/10.1093/bioinformatics/bts394 |
_version_ | 1782242705499226112 |
---|---|
author | Pham, Thang V. Jimenez, Connie R. |
author_facet | Pham, Thang V. Jimenez, Connie R. |
author_sort | Pham, Thang V. |
collection | PubMed |
description | Motivation: Recent technology platforms in proteomics and genomics produce count data for quantitative analysis. Previous works on statistical significance analysis for count data have mainly focused on the independent sample setting, which does not cover the case where pairs of measurements are taken from individual patients before and after treatment. This experimental setting requires paired sample testing such as the paired t-test often used for continuous measurements. A state-of-the-art method uses a negative binomial distribution in a generalized linear model framework for paired sample testing. A paired sample design assumes that the relative change within each pair is constant across biological samples. This model can be used as an approximation to the true model in cases of heterogeneity of response in complex biological systems. We aim to specify the variation in response explicitly in combination with the inherent technical variation. Results: We formulate the problem of paired sample test for count data in a framework of statistical combination of multiple contingency tables. In particular, we specify explicitly a random distribution for the effect with an inverted beta model. The technical variation can be modeled by either a standard Poisson distribution or an exponentiated Poisson distribution, depending on the reproducibility of the acquisition workflow. The new statistical test is evaluated on both proteomics and genomics datasets, showing a comparable performance to the state-of-the-art method in general, and in several cases where the two methods differ, the proposed test returns more reasonable p-values. Availability: Available for download at http://www.oncoproteomics.nl/. Contact: t.pham@vumc.nl |
format | Online Article Text |
id | pubmed-3436821 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2012 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-3436821 2012-12-12 An accurate paired sample test for count data Pham, Thang V. Jimenez, Connie R. Bioinformatics Original Papers Oxford University Press 2012-09-15 2012-09-03 /pmc/articles/PMC3436821/ /pubmed/22962487 http://dx.doi.org/10.1093/bioinformatics/bts394 Text en © The Author(s) (2012). Published by Oxford University Press. http://creativecommons.org/licenses/by/3.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Original Papers Pham, Thang V. Jimenez, Connie R. An accurate paired sample test for count data |
title | An accurate paired sample test for count data |
title_sort | accurate paired sample test for count data |
topic | Original Papers |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3436821/ https://www.ncbi.nlm.nih.gov/pubmed/22962487 http://dx.doi.org/10.1093/bioinformatics/bts394 |
work_keys_str_mv | AT phamthangv anaccuratepairedsampletestforcountdata AT jimenezconnier anaccuratepairedsampletestforcountdata AT phamthangv accuratepairedsampletestforcountdata AT jimenezconnier accuratepairedsampletestforcountdata |
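
The description field above says the effect within each pair is given an explicit random distribution through an inverted beta model, with Poisson technical variation. One way to picture this, as a minimal sketch and not the authors' released software: if the "before" and "after" counts of a pair are Poisson with equal exposure, then conditional on their total the "after" count is binomial with success probability p, and placing a Beta(a, b) distribution on p is equivalent to placing an inverted beta distribution on the ratio p / (1 - p). The count is then beta-binomial. The function names, the equal-exposure assumption, and the example counts below are all hypothetical and chosen only for illustration.

```python
import math


def log_beta(a, b):
    """Natural log of the Beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def beta_binomial_logpmf(y, n, a, b):
    """Log-probability of y successes out of n under a beta-binomial(a, b).

    Here y plays the role of the "after" count and n the pair total.
    """
    log_choose = math.lgamma(n + 1) - math.lgamma(y + 1) - math.lgamma(n - y + 1)
    return log_choose + log_beta(y + a, n - y + b) - log_beta(a, b)


def paired_log_likelihood(pairs, a, b):
    """Sum of beta-binomial log-probabilities over (before, after) count pairs.

    Assumes equal exposure (for example, equal library size) within each pair.
    """
    return sum(
        beta_binomial_logpmf(after, before + after, a, b)
        for before, after in pairs
    )


# Made-up counts for four hypothetical patients: (before, after).
example_pairs = [(10, 18), (7, 15), (12, 20), (5, 4)]

# Compare a symmetric effect distribution (no average change) with one
# shifted towards higher "after" counts.
ll_symmetric = paired_log_likelihood(example_pairs, 5.0, 5.0)
ll_shifted = paired_log_likelihood(example_pairs, 8.0, 5.0)
print(ll_symmetric, ll_shifted)
```

A real analysis would estimate a and b from the data, account for unequal exposure between the two samples of a pair, and derive a p-value from a proper test statistic; the sketch only shows how a beta-distributed effect and Poisson counts combine into a single likelihood.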