Moving beyond the classic difference-in-differences model: a simulation study comparing statistical methods for estimating effectiveness of state-level policies
BACKGROUND: Reliable evaluations of state-level policies are essential for identifying effective policies and informing policymakers’ decisions. State-level policy evaluations commonly use a difference-in-differences (DID) study design; yet within this framework, statistical model specification vari...
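Main Authors: | Griffin, Beth Ann; Schuler, Megan S.; Stuart, Elizabeth A.; Patrick, Stephen; McNeer, Elizabeth; Smart, Rosanna; Powell, David; Stein, Bradley D.; Schell, Terry L.; Pacula, Rosalie Liccardo |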
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2021 |
Subjects: | Research |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8666265/ https://www.ncbi.nlm.nih.gov/pubmed/34895172 http://dx.doi.org/10.1186/s12874-021-01471-y |
_version_ | 1784614170943029248 |
---|---|
author | Griffin, Beth Ann; Schuler, Megan S.; Stuart, Elizabeth A.; Patrick, Stephen; McNeer, Elizabeth; Smart, Rosanna; Powell, David; Stein, Bradley D.; Schell, Terry L.; Pacula, Rosalie Liccardo |
author_sort | Griffin, Beth Ann |
collection | PubMed |
description | BACKGROUND: Reliable evaluations of state-level policies are essential for identifying effective policies and informing policymakers’ decisions. State-level policy evaluations commonly use a difference-in-differences (DID) study design; yet within this framework, statistical model specification varies notably across studies. More guidance is needed about which statistical models perform best when estimating how state-level policies affect outcomes. METHODS: Motivated by applied state-level opioid policy evaluations, we implemented an extensive simulation study to compare the statistical performance of multiple variations of the two-way fixed effect models traditionally used for DID under a range of simulation conditions. We also explored the performance of autoregressive (AR) and generalized estimating equation (GEE) models. We simulated policy effects on annual state-level opioid mortality rates and assessed statistical performance using various metrics, including directional bias, magnitude bias, and root mean squared error. We also reported Type I error rates and the rate of correctly rejecting the null hypothesis (i.e., power), given the prevalence of frequentist null hypothesis significance testing in the applied literature. RESULTS: Most linear models resulted in minimal bias. However, non-linear models and population-weighted versions of classic linear two-way fixed effect and linear GEE models yielded considerable bias (60 to 160%). Further, root mean squared error was minimized by linear AR models when we examined crude mortality rates and by negative binomial models when we examined raw death counts. In the context of frequentist hypothesis testing, many models yielded high Type I error rates and very low rates of correctly rejecting the null hypothesis (< 10%), raising concerns of spurious conclusions about policy effectiveness in the opioid literature. When considering performance across models, the linear AR models were optimal in terms of directional bias, root mean squared error, Type I error, and correct rejection rates. CONCLUSIONS: The findings highlight notable limitations of commonly used statistical models for DID designs, which are widely used in opioid policy studies and in state policy evaluations more broadly. In contrast, the optimal model we identified, the AR model, is rarely used in state policy evaluation. We urge applied researchers to move beyond the classic DID paradigm and adopt AR models. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12874-021-01471-y. |
format | Online Article Text |
id | pubmed-8666265 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-8666265 2021-12-13 BMC Med Res Methodol Research BioMed Central 2021-12-13 /pmc/articles/PMC8666265/ /pubmed/34895172 http://dx.doi.org/10.1186/s12874-021-01471-y Text en © The Author(s) 2021, corrected publication 2022. Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit https://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (https://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data. |
title | Moving beyond the classic difference-in-differences model: a simulation study comparing statistical methods for estimating effectiveness of state-level policies |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8666265/ https://www.ncbi.nlm.nih.gov/pubmed/34895172 http://dx.doi.org/10.1186/s12874-021-01471-y |
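
The description above refers to two-way fixed effect models for DID and to bias and root mean squared error as performance metrics. The following is a minimal sketch of that idea and not the authors' simulation code. It assumes a balanced panel, a single adoption year, and normally distributed noise. All function names, panel sizes, and parameter values are hypothetical choices made for illustration, and the AR and GEE models from the study are not covered.

```python
import math
import random


def double_demean(x):
    """Remove unit and period means from a balanced panel x[unit][period]."""
    n, t = len(x), len(x[0])
    unit_mean = [sum(row) / t for row in x]
    period_mean = [sum(x[i][j] for i in range(n)) / n for j in range(t)]
    grand = sum(unit_mean) / n
    return [[x[i][j] - unit_mean[i] - period_mean[j] + grand for j in range(t)]
            for i in range(n)]


def twfe_did(y, d):
    """Two-way fixed effects DID estimate for a balanced panel.

    Regressing the double-demeaned outcome on the double-demeaned policy
    indicator gives the same slope as OLS with unit and period dummies.
    """
    y_dd, d_dd = double_demean(y), double_demean(d)
    num = sum(a * b for ry, rd in zip(y_dd, d_dd) for a, b in zip(ry, rd))
    den = sum(b * b for rd in d_dd for b in rd)
    return num / den


def simulate_panel(rng, effect, noise_sd, n_states=20, n_years=10, start=5):
    """Hypothetical data: half the states adopt the policy in year `start`."""
    d = [[1.0 if (i < n_states // 2 and j >= start) else 0.0
          for j in range(n_years)] for i in range(n_states)]
    state_fx = [rng.gauss(0, 1) for _ in range(n_states)]
    year_fx = [rng.gauss(0, 1) for _ in range(n_years)]
    y = [[state_fx[i] + year_fx[j] + effect * d[i][j] + rng.gauss(0, noise_sd)
          for j in range(n_years)] for i in range(n_states)]
    return y, d


def bias_and_rmse(effect, noise_sd, n_sims=200, seed=0):
    """Average error (bias) and RMSE of the estimator over repeated draws."""
    rng = random.Random(seed)
    errors = []
    for _ in range(n_sims):
        y, d = simulate_panel(rng, effect, noise_sd)
        errors.append(twfe_did(y, d) - effect)
    bias = sum(errors) / n_sims
    rmse = math.sqrt(sum(e * e for e in errors) / n_sims)
    return bias, rmse
```

Calling `bias_and_rmse(effect=-0.5, noise_sd=1.0)` returns the two metrics for one hypothetical scenario. Comparing estimators in the way the description outlines would mean swapping `twfe_did` for another model and repeating the loop under the same simulated data.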