Cryptic multiple hypotheses testing in linear models: overestimated effect sizes and the winner's curse
Fitting generalised linear models (GLMs) with more than one predictor has become the standard method of analysis in evolutionary and behavioural research. Often, GLMs are used for exploratory data analysis, where one starts with a complex full model including interaction terms and then simplifies by...
Main Authors: | Forstmeier, Wolfgang; Schielzeth, Holger |
---|---|
Format: | Text |
Language: | English |
Published: | Springer-Verlag 2010 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3015194/ https://www.ncbi.nlm.nih.gov/pubmed/21297852 http://dx.doi.org/10.1007/s00265-010-1038-5 |
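For context on the abstract quoted in this record: the roughly 40% chance of finding at least one 'significant' effect when four predictors and their six two-way interactions (k = 10 tested terms) are all under the null follows from 1 - (1 - 0.05)^10 ≈ 0.401. The sketch below is an illustrative reconstruction of that point, not the authors' simulation code; the sample size, random seed, and use of statsmodels are assumptions made only for this example. A second sketch, on the winner's curse, follows the full record further down.

```python
# Illustrative sketch (assumed setup, not the paper's code): probability of at
# least one p < 0.05 among k = 10 terms when the response is pure noise.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
alpha, n_pred, N, n_sims = 0.05, 4, 200, 2000   # assumed values for illustration

# 4 main effects + 6 two-way interactions = 10 tested terms
k = n_pred + n_pred * (n_pred - 1) // 2
print("theoretical:", 1 - (1 - alpha) ** k)      # ~0.401

hits = 0
for _ in range(n_sims):
    X = rng.standard_normal((N, n_pred))
    # full design: main effects plus all two-way interaction columns
    interactions = np.column_stack([X[:, i] * X[:, j]
                                    for i in range(n_pred)
                                    for j in range(i + 1, n_pred)])
    design = sm.add_constant(np.column_stack([X, interactions]))
    y = rng.standard_normal(N)                   # response unrelated to any predictor
    pvals = sm.OLS(y, design).fit().pvalues[1:]  # drop the intercept's p value
    hits += int((pvals < alpha).any())

print("simulated  :", hits / n_sims)             # close to theory at this generous N/k
```

At this N/k ratio (200/10) the simulated rate should land near the closed-form value; the paper's point is that the rate climbs well above it once N/k is small and model simplification is applied.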
_version_ | 1782195457552809984 |
---|---|
author | Forstmeier, Wolfgang; Schielzeth, Holger |
author_facet | Forstmeier, Wolfgang; Schielzeth, Holger |
author_sort | Forstmeier, Wolfgang |
collection | PubMed |
description | Fitting generalised linear models (GLMs) with more than one predictor has become the standard method of analysis in evolutionary and behavioural research. Often, GLMs are used for exploratory data analysis, where one starts with a complex full model including interaction terms and then simplifies by removing non-significant terms. While this approach can be useful, it is problematic if significant effects are interpreted as if they arose from a single a priori hypothesis test. This is because model selection involves cryptic multiple hypothesis testing, a fact that has only rarely been acknowledged or quantified. We show that the probability of finding at least one ‘significant’ effect is high, even if all null hypotheses are true (e.g. 40% when starting with four predictors and their two-way interactions). This probability is close to theoretical expectations when the sample size (N) is large relative to the number of predictors including interactions (k). In contrast, type I error rates strongly exceed even those expectations when model simplification is applied to models that are over-fitted before simplification (low N/k ratio). The increase in false-positive results arises primarily from an overestimation of effect sizes among significant predictors, leading to upward-biased effect sizes that often cannot be reproduced in follow-up studies (‘the winner's curse’). Despite having their own problems, full model tests and P value adjustments can be used as a guide to how frequently type I errors arise by sampling variation alone. We favour the presentation of full models, since they best reflect the range of predictors investigated and ensure a balanced representation also of non-significant results. |
format | Text |
id | pubmed-3015194 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2010 |
publisher | Springer-Verlag |
record_format | MEDLINE/PubMed |
spelling | pubmed-3015194 2011-02-04 Cryptic multiple hypotheses testing in linear models: overestimated effect sizes and the winner's curse Forstmeier, Wolfgang Schielzeth, Holger Behav Ecol Sociobiol Original Paper Fitting generalised linear models (GLMs) with more than one predictor has become the standard method of analysis in evolutionary and behavioural research. Often, GLMs are used for exploratory data analysis, where one starts with a complex full model including interaction terms and then simplifies by removing non-significant terms. While this approach can be useful, it is problematic if significant effects are interpreted as if they arose from a single a priori hypothesis test. This is because model selection involves cryptic multiple hypothesis testing, a fact that has only rarely been acknowledged or quantified. We show that the probability of finding at least one ‘significant’ effect is high, even if all null hypotheses are true (e.g. 40% when starting with four predictors and their two-way interactions). This probability is close to theoretical expectations when the sample size (N) is large relative to the number of predictors including interactions (k). In contrast, type I error rates strongly exceed even those expectations when model simplification is applied to models that are over-fitted before simplification (low N/k ratio). The increase in false-positive results arises primarily from an overestimation of effect sizes among significant predictors, leading to upward-biased effect sizes that often cannot be reproduced in follow-up studies (‘the winner's curse’). Despite having their own problems, full model tests and P value adjustments can be used as a guide to how frequently type I errors arise by sampling variation alone. We favour the presentation of full models, since they best reflect the range of predictors investigated and ensure a balanced representation also of non-significant results. Springer-Verlag 2010-08-19 2011 /pmc/articles/PMC3015194/ /pubmed/21297852 http://dx.doi.org/10.1007/s00265-010-1038-5 Text en © The Author(s) 2010 https://creativecommons.org/licenses/by-nc/4.0/ This article is distributed under the terms of the Creative Commons Attribution Noncommercial License which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited. |
spellingShingle | Original Paper Forstmeier, Wolfgang Schielzeth, Holger Cryptic multiple hypotheses testing in linear models: overestimated effect sizes and the winner's curse |
title | Cryptic multiple hypotheses testing in linear models: overestimated effect sizes and the winner's curse |
title_full | Cryptic multiple hypotheses testing in linear models: overestimated effect sizes and the winner's curse |
title_fullStr | Cryptic multiple hypotheses testing in linear models: overestimated effect sizes and the winner's curse |
title_full_unstemmed | Cryptic multiple hypotheses testing in linear models: overestimated effect sizes and the winner's curse |
title_short | Cryptic multiple hypotheses testing in linear models: overestimated effect sizes and the winner's curse |
title_sort | cryptic multiple hypotheses testing in linear models: overestimated effect sizes and the winner's curse |
topic | Original Paper |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3015194/ https://www.ncbi.nlm.nih.gov/pubmed/21297852 http://dx.doi.org/10.1007/s00265-010-1038-5 |
work_keys_str_mv | AT forstmeierwolfgang crypticmultiplehypothesestestinginlinearmodelsoverestimatedeffectsizesandthewinnerscurse AT schielzethholger crypticmultiplehypothesestestinginlinearmodelsoverestimatedeffectsizesandthewinnerscurse |
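To complement the record above, here is a minimal sketch of the 'winner's curse' that the abstract describes: when a modest true effect is estimated with limited power, the estimates that happen to cross the significance threshold overstate it on average, even though the estimator is unbiased overall. The true effect size, sample size, and use of scipy are illustrative assumptions, not values or code from the paper.

```python
# Illustrative sketch (assumed setup, not the paper's analysis): conditioning on
# significance inflates the reported effect size when power is modest.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
true_beta, N, n_sims = 0.15, 100, 5000     # assumed values for illustration

slopes, significant = [], []
for _ in range(n_sims):
    x = rng.standard_normal(N)
    y = true_beta * x + rng.standard_normal(N)   # one predictor with a small real effect
    fit = stats.linregress(x, y)                 # simple one-predictor regression
    slopes.append(fit.slope)
    significant.append(fit.pvalue < 0.05)

slopes = np.array(slopes)
significant = np.array(significant)
print("true effect                  :", true_beta)
print("mean over all estimates      :", round(slopes.mean(), 3))              # near 0.15, unbiased
print("mean over 'significant' ones :", round(slopes[significant].mean(), 3)) # systematically larger
print("share reaching p < 0.05      :", round(significant.mean(), 2))         # the test's modest power
```

A single-predictor regression is used only to keep the example short; the same conditioning-on-significance bias is what inflates the estimates of the terms that survive model simplification in multi-predictor GLMs, and it is why those effects often shrink in follow-up studies.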