Within study comparisons and risk of bias in international development: Systematic review and critical appraisal

Bibliographic Details
Main authors: Villar, Paul Fenton; Waddington, Hugh
Format: Online Article Text
Language: English
Published: John Wiley and Sons Inc. 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8356524/
https://www.ncbi.nlm.nih.gov/pubmed/37131472
http://dx.doi.org/10.1002/cl2.1027
Description
Summary:

BACKGROUND: Many systematic reviews incorporate nonrandomised studies of effects, sometimes called quasi‐experiments or natural experiments. However, the extent to which nonrandomised studies produce unbiased effect estimates is unclear, whether in expectation or in practice. The usual way that systematic reviews quantify bias is through “risk of bias assessment” and indirect comparison of findings across studies using meta‐analysis. A more direct, practical way to quantify the bias in nonrandomised studies is through “internal replication research”, which compares the findings from nonrandomised studies with estimates from a benchmark randomised controlled trial conducted in the same population. Despite the existence of many risk of bias tools, none is conceptualised to comprehensively assess nonrandomised approaches with selection on unobservables, such as regression discontinuity designs (RDDs). The few that are conceptualised with these studies in mind do not draw on the extensive literature on internal replications (within‐study comparisons) of randomised trials.

OBJECTIVES: Our research objectives were as follows. Objective 1: to undertake a systematic review of nonrandomised internal study replications of international development interventions. Objective 2: to develop a risk of bias tool for RDDs, an increasingly common method used in social and economic programme evaluation.

METHODS: We used the following methods to achieve our objectives. Objective 1: we searched systematically for nonrandomised internal study replications of benchmark randomised experiments of social and economic interventions in low‐ and middle‐income countries (L&MICs). We assessed the risk of bias in benchmark randomised experiments and synthesised evidence on the relative bias effect sizes produced by benchmark and nonrandomised comparison arms. Objective 2: we used document review and expert consultation to further develop a risk of bias tool for quasi‐experimental studies of interventions (ROBINS‐I) for RDDs.

RESULTS: Objective 1: we located 10 nonrandomised internal study replications of randomised trials in L&MICs, six of which are of RDDs, while the remainder use a combination of statistical matching and regression techniques. We found that benchmark experiments used in internal replications in international development are, in the main, well conducted but have “some concerns” about threats to validity, usually arising from the methods of outcomes data collection. Most internal replication studies report a range of different specifications for both the benchmark estimate and the nonrandomised replication estimate. We extracted and standardised 604 bias coefficient effect sizes from these studies, and present average results narratively. Objective 2: RDDs are characterised by prospective assignment of participants based on a threshold variable. Our review of the literature indicated that there are two main types of RDD. The most common type is designed retrospectively: the researcher identifies post hoc the relationship between outcomes and a threshold variable that determines assignment to intervention at pretest. These designs usually draw on routine data collection such as administrative records or household surveys. The other, less common, type is a prospective design in which the researcher is also involved in allocating participants to treatment groups from the outset. We developed a risk of bias tool for RDDs.

CONCLUSIONS: Internal study replications provide the grounds on which bias assessment tools can be evidenced. We conclude that existing risk of bias tools need to be further developed for use by Campbell Collaboration authors, and that there is a wide range of risk of bias tools and internal study replications to draw on in better designing these tools. We have suggested the development of a promising approach for RDDs. Further work is needed on common methodologies in programme evaluation, for example on statistical matching approaches. We also highlight that broader efforts to identify all existing internal replication studies should consider more specialised systematic search strategies within particular literatures, so as to overcome a lack of systematic indexing of this evidence.