Factoring a 2 x 2 contingency table
We show that a two-component proportional representation provides the necessary framework to account for the properties of a 2 × 2 contingency table. This corresponds to the factorization of the table as a product of proportion and diagonal row or column sum matrices. The row and column sum invarian...
Main author: | |
---|---|
Format: | Online Article Text |
Language: | English |
Published: |
Public Library of Science
2019
|
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6814214/ https://www.ncbi.nlm.nih.gov/pubmed/31652283 http://dx.doi.org/10.1371/journal.pone.0224460 |
_version_ | 1783462972314615808 |
---|---|
author | Luck, Stanley |
author_facet | Luck, Stanley |
author_sort | Luck, Stanley |
collection | PubMed |
description | We show that a two-component proportional representation provides the necessary framework to account for the properties of a 2 × 2 contingency table. This corresponds to the factorization of the table as a product of proportion and diagonal row or column sum matrices. The row and column sum invariant measures for proportional variation are obtained. Geometrically, these correspond to displacements of two point vectors in the standard one-simplex, which are reduced to a center-of-mass coordinate representation, [Image: see text] . Then, effect size measures, such as the odds ratio and relative risk, correspond to different perspective functions for the mapping of (δ, μ) to [Image: see text] . Furthermore, variations in δ and μ will be associated with different cost-benefit trade-offs for a given application. Therefore, pure mathematics alone does not provide the specification of a general form for the perspective function. This implies that the question of the merits of the odds ratio versus relative risk cannot be resolved in a general way. Expressions are obtained for the marginal sum dependence and the relations between various effect size measures, including the simple matching coefficient, odds ratio, relative risk, Yule’s Q, ϕ, and Goodman and Kruskal’s τ(c|r). We also show that Gini information gain (IG_G) is equivalent to ϕ² in the classification and regression tree (CART) algorithm. Then, IG_G can yield misleading results due to the dependence on marginal sums. Monte Carlo methods facilitate the detailed specification of stochastic effects in the data acquisition process and provide a practical way to estimate the confidence interval for an effect size. |
format | Online Article Text |
id | pubmed-6814214 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2019 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-68142142019-11-03 Factoring a 2 x 2 contingency table Luck, Stanley PLoS One Research Article
Public Library of Science 2019-10-25 /pmc/articles/PMC6814214/ /pubmed/31652283 http://dx.doi.org/10.1371/journal.pone.0224460 Text en © 2019 Stanley Luck http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/) , which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. |
spellingShingle | Research Article Luck, Stanley Factoring a 2 x 2 contingency table |
title | Factoring a 2 x 2 contingency table |
title_full | Factoring a 2 x 2 contingency table |
title_fullStr | Factoring a 2 x 2 contingency table |
title_full_unstemmed | Factoring a 2 x 2 contingency table |
title_short | Factoring a 2 x 2 contingency table |
title_sort | factoring a 2 x 2 contingency table |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6814214/ https://www.ncbi.nlm.nih.gov/pubmed/31652283 http://dx.doi.org/10.1371/journal.pone.0224460 |
work_keys_str_mv | AT luckstanley factoringa2x2contingencytable |
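
The factorization and effect-size relations summarized in the description can be illustrated with a short Python sketch. This is not the author's code: the function names (`factor_by_rows`, `effect_sizes`, `gini_gain`, `monte_carlo_ci`), the example counts, and the multinomial resampling model are assumptions made for illustration. The sketch also checks a standard algebraic identity, that the Gini gain of a binary split equals ϕ² times the Gini impurity of the parent node, which is one possible reading of the stated equivalence between IG_G and ϕ².

```python
import math
import random


def factor_by_rows(table):
    """Split a table into row sums and row proportions.

    table[i][j] == row_sums[i] * proportions[i][j], i.e. the table is the
    product of a diagonal row-sum matrix and a row-proportion matrix.
    """
    row_sums = [sum(row) for row in table]
    proportions = [[x / s for x in row] for row, s in zip(table, row_sums)]
    return row_sums, proportions


def effect_sizes(table):
    """Common effect-size measures for a 2 x 2 table [[a, b], [c, d]]."""
    (a, b), (c, d) = table
    n1, n2 = a + b, c + d          # row sums
    m1, m2 = a + c, b + d          # column sums
    return {
        "odds_ratio": (a * d) / (b * c),
        "relative_risk": (a / n1) / (c / n2),
        "yule_q": (a * d - b * c) / (a * d + b * c),
        "phi": (a * d - b * c) / math.sqrt(n1 * n2 * m1 * m2),
    }


def gini(p):
    """Gini impurity of a two-class node with class-1 proportion p."""
    return 2 * p * (1 - p)


def gini_gain(table):
    """Return (Gini gain, parent impurity); rows are the two child nodes."""
    (a, b), (c, d) = table
    n1, n2 = a + b, c + d
    n = n1 + n2
    parent = gini((a + c) / n)
    children = (n1 / n) * gini(a / n1) + (n2 / n) * gini(c / n2)
    return parent - children, parent


def monte_carlo_ci(table, stat, draws=1000, level=0.95, seed=0):
    """Percentile interval for stat under multinomial resampling.

    Assumption: the four cells are resampled with probabilities equal to
    their observed proportions; other sampling designs need other models.
    """
    rng = random.Random(seed)
    cells = [x for row in table for x in row]
    n = sum(cells)
    values = []
    for _ in range(draws):
        sample = rng.choices(range(4), weights=cells, k=n)
        counts = [sample.count(k) for k in range(4)]
        if 0 in counts:            # skip tables where a ratio is undefined
            continue
        values.append(stat([counts[:2], counts[2:]]))
    values.sort()
    lo = values[int((1 - level) / 2 * len(values))]
    hi = values[int((1 + level) / 2 * len(values)) - 1]
    return lo, hi


# Hypothetical example counts, chosen only for illustration.
table = [[30, 10], [15, 45]]
row_sums, proportions = factor_by_rows(table)
sizes = effect_sizes(table)
gain, parent = gini_gain(table)
lo, hi = monte_carlo_ci(table, lambda t: effect_sizes(t)["odds_ratio"])
```

For this table the ratio `gain / parent` agrees with `sizes["phi"] ** 2`. Because `parent` depends only on the column sums, the sketch also shows how the Gini gain carries a marginal-sum dependence that ϕ² alone does not.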