
Case for omitting tied observations in the two-sample t-test and the Wilcoxon-Mann-Whitney Test

When the distributional assumptions for a t-test are not met, the default position of many analysts is to resort to a rank-based test, such as the Wilcoxon-Mann-Whitney Test, to compare the difference in means between two samples. The Wilcoxon-Mann-Whitney Test presents no danger of tied observations...


Bibliographic Details
Main Author:	McGee, Monnie
Format:	Online Article Text
Language:	English
Published:	Public Library of Science 2018
Subjects:	Research Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6057651/ https://www.ncbi.nlm.nih.gov/pubmed/30040850 http://dx.doi.org/10.1371/journal.pone.0200837

_version_	1783341567665242112
author	McGee, Monnie
author_facet	McGee, Monnie
author_sort	McGee, Monnie
collection	PubMed
description	When the distributional assumptions for a t-test are not met, the default position of many analysts is to resort to a rank-based test, such as the Wilcoxon-Mann-Whitney Test, to compare the difference in means between two samples. The Wilcoxon-Mann-Whitney Test presents no danger of tied observations when the observations in the data are continuous. However, in practice, observations are discretized due to various logical reasons, or the data are ordinal in nature. When ranks are tied, most textbooks recommend using mid-ranks to replace the tied ranks, a practice that affects the distribution of the Wilcoxon-Mann-Whitney Test under the null hypothesis. Other methods for breaking ties have also been proposed. In this study, we examine four tie-breaking methods (average-scores, mid-ranks, jittering, and omission) for their effects on Type I and Type II error of the Wilcoxon-Mann-Whitney Test and the two-sample t-test for various combinations of sample sizes, underlying population distributions, and percentages of tied observations. We use the results to determine the maximum percentage of ties for which the power and size are seriously affected, and which method of tie-breaking results in the best Type I and Type II error properties. Not surprisingly, the underlying population distribution of the data has less of an effect on the Wilcoxon-Mann-Whitney Test than on the t-test. Surprisingly, we find that the jittering and omission methods tend to hold Type I error at the nominal level, even for small sample sizes, with no substantial sacrifice in terms of Type II error. Furthermore, the t-test and the Wilcoxon-Mann-Whitney Test are equally affected by ties in terms of Type I and Type II error; therefore, we recommend omitting tied observations when they occur for both the two-sample t-test and the Wilcoxon-Mann-Whitney Test, because of the bias in Type I error that is created when tied observations are left in the data, in the case of the t-test, or adjusted using mid-ranks or average-scores, in the case of the Wilcoxon-Mann-Whitney Test.
format	Online Article Text
id	pubmed-6057651
institution	National Center for Biotechnology Information
language	English
publishDate	2018
publisher	Public Library of Science
record_format	MEDLINE/PubMed
spelling	pubmed-60576512018-08-06 Case for omitting tied observations in the two-sample t-test and the Wilcoxon-Mann-Whitney Test McGee, Monnie PLoS One Research Article When the distributional assumptions for a t-test are not met, the default position of many analysts is to resort to a rank-based test, such as the Wilcoxon-Mann-Whitney Test, to compare the difference in means between two samples. The Wilcoxon-Mann-Whitney Test presents no danger of tied observations when the observations in the data are continuous. However, in practice, observations are discretized due to various logical reasons, or the data are ordinal in nature. When ranks are tied, most textbooks recommend using mid-ranks to replace the tied ranks, a practice that affects the distribution of the Wilcoxon-Mann-Whitney Test under the null hypothesis. Other methods for breaking ties have also been proposed. In this study, we examine four tie-breaking methods (average-scores, mid-ranks, jittering, and omission) for their effects on Type I and Type II error of the Wilcoxon-Mann-Whitney Test and the two-sample t-test for various combinations of sample sizes, underlying population distributions, and percentages of tied observations. We use the results to determine the maximum percentage of ties for which the power and size are seriously affected, and which method of tie-breaking results in the best Type I and Type II error properties. Not surprisingly, the underlying population distribution of the data has less of an effect on the Wilcoxon-Mann-Whitney Test than on the t-test. Surprisingly, we find that the jittering and omission methods tend to hold Type I error at the nominal level, even for small sample sizes, with no substantial sacrifice in terms of Type II error. Furthermore, the t-test and the Wilcoxon-Mann-Whitney Test are equally affected by ties in terms of Type I and Type II error; therefore, we recommend omitting tied observations when they occur for both the two-sample t-test and the Wilcoxon-Mann-Whitney Test, because of the bias in Type I error that is created when tied observations are left in the data, in the case of the t-test, or adjusted using mid-ranks or average-scores, in the case of the Wilcoxon-Mann-Whitney Test. Public Library of Science 2018-07-24 /pmc/articles/PMC6057651/ /pubmed/30040850 http://dx.doi.org/10.1371/journal.pone.0200837 Text en © 2018 Monnie McGee http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
spellingShingle	Research Article McGee, Monnie Case for omitting tied observations in the two-sample t-test and the Wilcoxon-Mann-Whitney Test
title	Case for omitting tied observations in the two-sample t-test and the Wilcoxon-Mann-Whitney Test
title_full	Case for omitting tied observations in the two-sample t-test and the Wilcoxon-Mann-Whitney Test
title_fullStr	Case for omitting tied observations in the two-sample t-test and the Wilcoxon-Mann-Whitney Test
title_full_unstemmed	Case for omitting tied observations in the two-sample t-test and the Wilcoxon-Mann-Whitney Test
title_short	Case for omitting tied observations in the two-sample t-test and the Wilcoxon-Mann-Whitney Test
title_sort	case for omitting tied observations in the two-sample t-test and the wilcoxon-mann-whitney test
topic	Research Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6057651/ https://www.ncbi.nlm.nih.gov/pubmed/30040850 http://dx.doi.org/10.1371/journal.pone.0200837
work_keys_str_mv	AT mcgeemonnie caseforomittingtiedobservationsinthetwosamplettestandthewilcoxonmannwhitneytest
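
The description above compares several ways of handling ties before running the Wilcoxon-Mann-Whitney Test and the two-sample t-test. As a rough illustration only, the Python sketch below applies three of them to a small made-up data set using NumPy and SciPy. The helper names (omit_ties, jitter, compare), the sample values, the jitter width, and the reading of "omission" as dropping every value that occurs more than once in the pooled data are assumptions made for this example and are not taken from the article. The average-scores method is left out because this record does not define it.

```python
import numpy as np
from scipy import stats


def omit_ties(x, y):
    """Drop every value that occurs more than once in the pooled data.

    This is one possible reading of "omission"; the article may define it differently.
    """
    pooled = np.concatenate([x, y])
    values, counts = np.unique(pooled, return_counts=True)
    tied = values[counts > 1]
    return x[~np.isin(x, tied)], y[~np.isin(y, tied)]


def jitter(x, y, rng, scale=0.01):
    """Break ties by adding small uniform noise.

    scale is an assumed width; it should stay well below the spacing
    between distinct data values so their order is preserved.
    """
    return (x + rng.uniform(-scale, scale, x.size),
            y + rng.uniform(-scale, scale, y.size))


def compare(x, y, seed=0, scale=0.01):
    """Return {method: (WMW p-value, t-test p-value)} for each tie treatment."""
    rng = np.random.default_rng(seed)
    variants = {
        # Data left as-is: SciPy ranks tied values by their average rank.
        "mid-ranks": (x, y),
        "jittering": jitter(x, y, rng, scale),
        "omission": omit_ties(x, y),
    }
    results = {}
    for name, (a, b) in variants.items():
        wmw = stats.mannwhitneyu(a, b, alternative="two-sided")
        tt = stats.ttest_ind(a, b)  # pooled-variance two-sample t-test
        results[name] = (float(wmw.pvalue), float(tt.pvalue))
    return results


# Made-up discretized samples containing ties (2.0 three times, 6.0 twice).
x = np.array([1.0, 2.0, 2.0, 3.0, 5.0, 7.0, 8.0])
y = np.array([2.0, 4.0, 6.0, 6.0, 9.0, 10.0, 11.0])

results = compare(x, y)
for method, (p_wmw, p_t) in results.items():
    print(f"{method}: WMW p={p_wmw:.3f}, t-test p={p_t:.3f}")
```

A single data set like this says nothing about Type I or Type II error. Estimating those would require repeating the comparison over many simulated samples, as the study describes.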
