Cargando…
Multivariate Welch t-test on distances
Motivation: Permutational non-Euclidean analysis of variance, PERMANOVA, is routinely used in exploratory analysis of multivariate datasets to draw conclusions about the significance of patterns visualized through dimension reduction. This method recognizes that pairwise distance matrix between obse...
Autor principal: | |
---|---|
Formato: | Online Artículo Texto |
Lenguaje: | English |
Publicado: |
Oxford University Press
2016
|
Materias: | |
Acceso en línea: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5181538/ https://www.ncbi.nlm.nih.gov/pubmed/27515741 http://dx.doi.org/10.1093/bioinformatics/btw524 |
Sumario: | Motivation: Permutational non-Euclidean analysis of variance, PERMANOVA, is routinely used in exploratory analysis of multivariate datasets to draw conclusions about the significance of patterns visualized through dimension reduction. This method recognizes that pairwise distance matrix between observations is sufficient to compute within and between group sums of squares necessary to form the (pseudo) F statistic. Moreover, not only Euclidean, but arbitrary distances can be used. This method, however, suffers from loss of power and type I error inflation in the presence of heteroscedasticity and sample size imbalances. Results: We develop a solution in the form of a distance-based Welch t-test, [Formula: see text] , for two sample potentially unbalanced and heteroscedastic data. We demonstrate empirically the desirable type I error and power characteristics of the new test. We compare the performance of PERMANOVA and [Formula: see text] in reanalysis of two existing microbiome datasets, where the methodology has originated. Availability and Implementation: The source code for methods and analysis of this article is available at https://github.com/alekseyenko/Tw2. Further guidance on application of these methods can be obtained from the author. Contact: alekseye@musc.edu |
---|