Moving beyond the classic difference-in-differences model: a simulation study comparing statistical methods for estimating effectiveness of state-level policies

BACKGROUND: Reliable evaluations of state-level policies are essential for identifying effective policies and informing policymakers' decisions. State-level policy evaluations commonly use a difference-in-differences (DID) study design; yet within this framework, statistical model specification varies notably across studies. More guidance is needed about which statistical models perform best when estimating how state-level policies affect outcomes.

METHODS: Motivated by applied state-level opioid policy evaluations, we implemented an extensive simulation study to compare the statistical performance of multiple variations of the two-way fixed-effects models traditionally used for DID under a range of simulation conditions. We also explored the performance of autoregressive (AR) and generalized estimating equation (GEE) models. We simulated policy effects on annual state-level opioid mortality rates and assessed statistical performance using various metrics, including directional bias, magnitude bias, and root mean squared error. We also reported Type I error rates and the rate of correctly rejecting the null hypothesis (i.e., power), given the prevalence of frequentist null hypothesis significance testing in the applied literature.

RESULTS: Most linear models resulted in minimal bias. However, non-linear models and population-weighted versions of the classic linear two-way fixed-effects and linear GEE models yielded considerable bias (60 to 160%). Further, root mean squared error was minimized by linear AR models when we examined crude mortality rates and by negative binomial models when we examined raw death counts. In the context of frequentist hypothesis testing, many models yielded high Type I error rates and very low rates of correctly rejecting the null hypothesis (< 10%), raising concerns of spurious conclusions about policy effectiveness in the opioid literature. Considering performance across models, the linear AR models were optimal in terms of directional bias, root mean squared error, Type I error, and correct rejection rates.

CONCLUSIONS: The findings highlight notable limitations of commonly used statistical models for DID designs, which are widely used in opioid policy studies and in state policy evaluations more broadly. In contrast, the optimal model we identified, the AR model, is rarely used in state policy evaluation. We urge applied researchers to move beyond the classic DID paradigm and adopt AR models.

SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12874-021-01471-y.
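
To make the comparison concrete, the sketch below (not the authors' code; all settings, including the number of states and years, the AR(1) error coefficient, the effect size, and the lagged-outcome AR specification, are illustrative assumptions) simulates state-by-year panels with serially correlated errors, fits a classic two-way fixed-effects DID model and a simple AR alternative using Python's statsmodels, and reports bias, root mean squared error, and rejection rates under the null (Type I error) and under a true effect (power):

import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)

N_STATES, N_YEARS = 50, 10   # panel dimensions (illustrative)
N_TREATED = 25               # number of states that adopt the policy
POLICY_YEAR = 5              # first post-policy year for treated states

def simulate_panel(true_effect):
    """One simulated state-by-year panel with AR(1) errors and a policy effect."""
    state_fe = rng.normal(10.0, 2.0, N_STATES)   # state-specific baseline rates
    year_fe = np.linspace(0.0, 1.0, N_YEARS)     # common secular trend
    treated = set(rng.choice(N_STATES, N_TREATED, replace=False).tolist())
    rows = []
    for s in range(N_STATES):
        eps = 0.0
        for t in range(N_YEARS):
            eps = 0.5 * eps + rng.normal(0.0, 1.0)   # serially correlated error
            policy = int(s in treated and t >= POLICY_YEAR)
            rows.append((s, t, policy,
                         state_fe[s] + year_fe[t] + true_effect * policy + eps))
    return pd.DataFrame(rows, columns=["state", "year", "policy", "rate"])

def twfe_estimate(df):
    """Classic two-way fixed-effects DID: state and year dummies,
    cluster-robust standard errors by state."""
    fit = smf.ols("rate ~ policy + C(state) + C(year)", data=df).fit(
        cov_type="cluster", cov_kwds={"groups": df["state"]})
    return fit.params["policy"], fit.pvalues["policy"]

def ar_estimate(df):
    """Simple AR alternative: condition on the one-year lag of the outcome
    instead of state fixed effects (one plausible variant; the authors'
    exact AR specification may differ)."""
    df = df.sort_values(["state", "year"]).copy()
    df["lag_rate"] = df.groupby("state")["rate"].shift(1)
    df = df.dropna()
    fit = smf.ols("rate ~ policy + lag_rate + C(year)", data=df).fit(
        cov_type="cluster", cov_kwds={"groups": df["state"]})
    return fit.params["policy"], fit.pvalues["policy"]

def monte_carlo(true_effect, n_reps=200):
    """Bias, RMSE, and rejection rate (alpha = 0.05) per estimator. With
    true_effect = 0 the rejection rate is the Type I error rate; with a
    nonzero effect it is the correct rejection rate (power)."""
    results = {"TWFE": [], "AR": []}
    for _ in range(n_reps):
        df = simulate_panel(true_effect)
        results["TWFE"].append(twfe_estimate(df))
        results["AR"].append(ar_estimate(df))
    for name, pairs in results.items():
        coefs = np.array([c for c, _ in pairs])
        pvals = np.array([p for _, p in pairs])
        print(f"  {name}: bias={coefs.mean() - true_effect:+.3f}, "
              f"RMSE={np.sqrt(((coefs - true_effect) ** 2).mean()):.3f}, "
              f"rejection rate={(pvals < 0.05).mean():.2f}")

print("Null (true effect = 0):")
monte_carlo(0.0)
print("Alternative (true effect = -1):")
monte_carlo(-1.0)

The AR specification here conditions on the one-year lag of the outcome in place of state fixed effects; consult the article and its supplementary material for the exact model specifications the authors evaluated.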

Bibliographic Details
Main Authors: Griffin, Beth Ann; Schuler, Megan S.; Stuart, Elizabeth A.; Patrick, Stephen; McNeer, Elizabeth; Smart, Rosanna; Powell, David; Stein, Bradley D.; Schell, Terry L.; Pacula, Rosalie Liccardo
Format: Online Article (Text)
Language: English
Published: BioMed Central, 2021
Subjects: Research
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8666265/
https://www.ncbi.nlm.nih.gov/pubmed/34895172
http://dx.doi.org/10.1186/s12874-021-01471-y
License: © The Author(s) 2021, corrected publication 2022. Open Access: this article is licensed under a Creative Commons Attribution 4.0 International License (https://creativecommons.org/licenses/by/4.0/).