
Why and how we should join the shift from significance testing to estimation

A paradigm shift away from null hypothesis significance testing seems in progress. Based on simulations, we illustrate some of the underlying motivations. First, p‐values vary strongly from study to study, hence dichotomous inference using significance thresholds is usually unjustified. Second, ‘sta...


Bibliographic Details
Main authors: Berner, Daniel; Amrhein, Valentin
Format: Online Article Text
Language: English
Published: John Wiley and Sons Inc. 2022
Subjects: Methods Article
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9322409/
https://www.ncbi.nlm.nih.gov/pubmed/35582935
http://dx.doi.org/10.1111/jeb.14009
author Berner, Daniel
Amrhein, Valentin
collection PubMed
description A paradigm shift away from null hypothesis significance testing seems in progress. Based on simulations, we illustrate some of the underlying motivations. First, p‐values vary strongly from study to study, hence dichotomous inference using significance thresholds is usually unjustified. Second, ‘statistically significant’ results have overestimated effect sizes, a bias declining with increasing statistical power. Third, ‘statistically non‐significant’ results have underestimated effect sizes, and this bias gets stronger with higher statistical power. Fourth, the tested statistical hypotheses usually lack biological justification and are often uninformative. Despite these problems, a screen of 48 papers from the 2020 volume of the Journal of Evolutionary Biology exemplifies that significance testing is still used almost universally in evolutionary biology. All screened studies tested default null hypotheses of zero effect with the default significance threshold of p = 0.05, none presented a pre‐specified alternative hypothesis, pre‐study power calculation and the probability of ‘false negatives’ (beta error rate). The results sections of the papers presented 49 significance tests on average (median 23, range 0–390). Of 41 studies that contained verbal descriptions of a ‘statistically non‐significant’ result, 26 (63%) falsely claimed the absence of an effect. We conclude that studies in ecology and evolutionary biology are mostly exploratory and descriptive. We should thus shift from claiming to ‘test’ specific hypotheses statistically to describing and discussing many hypotheses (possible true effect sizes) that are most compatible with our data, given our statistical model. We already have the means for doing so, because we routinely present compatibility (‘confidence’) intervals covering these hypotheses.
format Online
Article
Text
id pubmed-9322409
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher John Wiley and Sons Inc.
record_format MEDLINE/PubMed
spelling pubmed-9322409 2022-07-30
Why and how we should join the shift from significance testing to estimation
Berner, Daniel
Amrhein, Valentin
J Evol Biol
Methods Article
John Wiley and Sons Inc. 2022-05-18 2022-06
/pmc/articles/PMC9322409/ /pubmed/35582935 http://dx.doi.org/10.1111/jeb.14009
Text en
© 2022 The Authors. Journal of Evolutionary Biology published by John Wiley & Sons Ltd on behalf of European Society for Evolutionary Biology.
This is an open access article under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits use, distribution and reproduction in any medium, provided the original work is properly cited.
title Why and how we should join the shift from significance testing to estimation
topic Methods Article