Analysis of Bayesian posterior significance and effect size indices for the two-sample t-test to support reproducible medical research

Bibliographic Details
Main Author: Kelter, Riko
Format: Online Article Text
Language: English
Published: BioMed Central 2020
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7178740/
https://www.ncbi.nlm.nih.gov/pubmed/32321438
http://dx.doi.org/10.1186/s12874-020-00968-2
_version_ 1783525526030254080
author Kelter, Riko
author_facet Kelter, Riko
author_sort Kelter, Riko
collection PubMed
description BACKGROUND: The replication crisis hit the medical sciences about a decade ago, but today still most of the flaws inherent in null hypothesis significance testing (NHST) have not been solved. While the drawbacks of p-values have been detailed in endless venues, for clinical research, only a few attractive alternatives have been proposed to replace p-values and NHST. Bayesian methods are one of them, and they are gaining increasing attention in medical research, as some of their advantages include the description of model parameters in terms of probability, as well as the incorporation of prior information in contrast to the frequentist framework. While Bayesian methods are not the only remedy to the situation, there is an increasing agreement that they are an essential way to avoid common misconceptions and false interpretation of study results. The requirements necessary for applying Bayesian statistics have transitioned from detailed programming knowledge into simple point-and-click programs like JASP. Still, the multitude of Bayesian significance and effect measures which contrast the gold standard of significance in medical research, the p-value, causes a lack of agreement on which measure to report.
METHODS: Therefore, in this paper, we conduct an extensive simulation study to compare common Bayesian significance and effect measures which can be obtained from a posterior distribution. In it, we analyse the behaviour of these measures for one of the most important statistical procedures in medical research and in particular clinical trials, the two-sample Student’s (and Welch’s) t-test.
RESULTS: The results show that some measures cannot state evidence for both the null and the alternative. While the different indices behave similarly regarding increasing sample size and noise, the prior modelling influences the obtained results and extreme priors allow for cherry-picking similar to p-hacking in the frequentist paradigm. The indices behave quite differently regarding their ability to control the type I error rates and regarding their ability to detect an existing effect.
CONCLUSION: Based on the results, two of the commonly used indices can be recommended for more widespread use in clinical and biomedical research, as they improve the type I error control compared to the classic two-sample t-test and enjoy multiple other desirable properties.
format Online
Article
Text
id pubmed-7178740
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-7178740 2020-04-26 Analysis of Bayesian posterior significance and effect size indices for the two-sample t-test to support reproducible medical research Kelter, Riko BMC Med Res Methodol Research Article BioMed Central 2020-04-22 /pmc/articles/PMC7178740/ /pubmed/32321438 http://dx.doi.org/10.1186/s12874-020-00968-2 Text en © The Author(s) 2020 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
spellingShingle Research Article
Kelter, Riko
Analysis of Bayesian posterior significance and effect size indices for the two-sample t-test to support reproducible medical research
title Analysis of Bayesian posterior significance and effect size indices for the two-sample t-test to support reproducible medical research
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7178740/
https://www.ncbi.nlm.nih.gov/pubmed/32321438
http://dx.doi.org/10.1186/s12874-020-00968-2
work_keys_str_mv AT kelterriko analysisofbayesianposteriorsignificanceandeffectsizeindicesforthetwosamplettesttosupportreproduciblemedicalresearch