External control arm analysis: an evaluation of propensity score approaches, G-computation, and doubly debiased machine learning
BACKGROUND: An external control arm is a cohort of control patients that are collected from data external to a single-arm trial. To provide an unbiased estimation of efficacy, the clinical profiles of patients from single and external arms should be aligned, typically using propensity score approach...
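The title and abstract contrast propensity score approaches with outcome-model approaches such as G-computation. As a minimal sketch, and not the article's own implementation, both ideas can be written in plain Python. Everything here is an illustrative assumption: the simulated data, the single binary covariate, the choice of the effect in the trial population as the target, and the function names.

```python
import random


def simulate(n, seed=0):
    """Simulated pooled data as rows of (x, a, y). Illustrative only."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = 1 if rng.random() < 0.5 else 0            # binary baseline covariate
        p_trial = 0.7 if x == 1 else 0.3              # arm membership depends on x
        a = 1 if rng.random() < p_trial else 0        # 1 = single-arm trial, 0 = external control
        y = 1.0 * a + 2.0 * x + rng.gauss(0.0, 1.0)   # true treatment effect is 1.0
        rows.append((x, a, y))
    return rows


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def naive_difference(rows):
    """Unadjusted difference in mean outcome between the two arms."""
    return mean(y for _, a, y in rows if a == 1) - mean(y for _, a, y in rows if a == 0)


def g_computation_att(rows):
    """Predict each trial patient's control outcome from an outcome model
    fitted on external controls (here: the control mean within each x stratum)."""
    mu0 = {x: mean(y for xi, a, y in rows if a == 0 and xi == x) for x in (0, 1)}
    return mean(y - mu0[x] for x, a, y in rows if a == 1)


def iptw_att(rows):
    """Reweight external controls by the odds e(x) / (1 - e(x)) so that their
    covariate mix matches the trial arm; e(x) is the stratum frequency of a = 1."""
    e = {x: mean(a for xi, a, _ in rows if xi == x) for x in (0, 1)}
    weighted = [(e[x] / (1.0 - e[x]), y) for x, a, y in rows if a == 0]
    control_mean = sum(w * y for w, y in weighted) / sum(w for w, _ in weighted)
    return mean(y for _, a, y in rows if a == 1) - control_mean


if __name__ == "__main__":
    data = simulate(4000)
    print("naive        :", round(naive_difference(data), 3))
    print("G-computation:", round(g_computation_att(data), 3))
    print("IPTW         :", round(iptw_att(data), 3))
```

With a single binary covariate and stratum means, the two adjusted estimates coincide algebraically. With continuous covariates and fitted regression or machine-learning models they generally differ.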
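Main authors: | Loiseau, Nicolas; Trichelair, Paul; He, Maxime; Andreux, Mathieu; Zaslavskiy, Mikhail; Wainrib, Gilles; Blum, Michael G. B. |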
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2022 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9795588/ https://www.ncbi.nlm.nih.gov/pubmed/36577946 http://dx.doi.org/10.1186/s12874-022-01799-z |
_version_ | 1784860294636371968 |
---|---|
author | Loiseau, Nicolas; Trichelair, Paul; He, Maxime; Andreux, Mathieu; Zaslavskiy, Mikhail; Wainrib, Gilles; Blum, Michael G. B. |
author_sort | Loiseau, Nicolas |
collection | PubMed |
description | BACKGROUND: An external control arm is a cohort of control patients drawn from data external to a single-arm trial. To provide an unbiased estimate of efficacy, the clinical profiles of patients from the single-arm trial and the external arm should be aligned, typically using propensity score approaches. Alternative approaches infer efficacy by comparing the outcomes of single-arm patients with machine-learning predictions of control patient outcomes. These methods include G-computation and Doubly Debiased Machine Learning (DDML), but they have not been sufficiently evaluated for External Control Arm (ECA) analysis. METHODS: We consider both numerical simulations and a trial replication procedure to evaluate the different statistical approaches: propensity score matching, Inverse Probability of Treatment Weighting (IPTW), G-computation, and DDML. The replication study relies on five type 2 diabetes randomized clinical trials accessed through the Yale University Open Data Access (YODA) project. From the pool of five trials, observational experiments are artificially built by replacing the control arm of one trial with an arm from another trial that contains similarly treated patients. RESULTS: Among the different statistical approaches, numerical simulations show that DDML has the smallest bias, followed by G-computation. G-computation usually achieves the lowest mean squared error (MSE). Relative to the other methods, the MSE of DDML varies and improves as sample size increases. For hypothesis testing, all methods control type I error, and DDML is the most conservative. G-computation has the highest statistical power; DDML has comparable power at [Formula: see text] but lower power at smaller sample sizes. The replication procedure also indicates that G-computation minimizes mean squared error, whereas DDML performs between G-computation and the propensity score approaches. The confidence intervals of G-computation are the narrowest, whereas those obtained with DDML are the widest for small sample sizes, which confirms its conservative nature. CONCLUSIONS: For external control arm analyses, methods based on outcome prediction models can reduce estimation error and increase statistical power compared to propensity score approaches. |
format | Online Article Text |
id | pubmed-9795588 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-9795588 2022-12-29 External control arm analysis: an evaluation of propensity score approaches, G-computation, and doubly debiased machine learning. Loiseau, Nicolas; Trichelair, Paul; He, Maxime; Andreux, Mathieu; Zaslavskiy, Mikhail; Wainrib, Gilles; Blum, Michael G. B. BMC Med Res Methodol. Research Article. BioMed Central 2022-12-28 /pmc/articles/PMC9795588/ /pubmed/36577946 http://dx.doi.org/10.1186/s12874-022-01799-z Text en © The Author(s) 2022 https://creativecommons.org/licenses/by/4.0/ Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data. |
title | External control arm analysis: an evaluation of propensity score approaches, G-computation, and doubly debiased machine learning |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9795588/ https://www.ncbi.nlm.nih.gov/pubmed/36577946 http://dx.doi.org/10.1186/s12874-022-01799-z |
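The description field above names Doubly Debiased Machine Learning (DDML) as a further approach. The sketch below shows one common doubly robust, cross-fitted estimator of that general family. It is an assumption that this resembles the estimator the article evaluates. The simulated data, the stratum-mean nuisance models (stand-ins for machine-learning models), the fold scheme, and all function names are hypothetical.

```python
import random


def simulate(n, seed=0):
    """Simulated pooled data as rows of (x, a, y). Illustrative only."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = 1 if rng.random() < 0.5 else 0            # binary baseline covariate
        p_trial = 0.7 if x == 1 else 0.3              # arm membership depends on x
        a = 1 if rng.random() < p_trial else 0        # 1 = single-arm trial, 0 = external control
        y = 1.0 * a + 2.0 * x + rng.gauss(0.0, 1.0)   # true treatment effect is 1.0
        rows.append((x, a, y))
    return rows


def fit_nuisances(train):
    """Stratum-wise estimates of e(x) = P(a = 1 | x) and mu0(x) = E[y | a = 0, x].
    Assumes every stratum of the training fold contains both arms."""
    e, mu0 = {}, {}
    for x in (0, 1):
        stratum = [(a, y) for xi, a, y in train if xi == x]
        controls = [y for a, y in stratum if a == 0]
        e[x] = sum(a for a, _ in stratum) / len(stratum)
        mu0[x] = sum(controls) / len(controls)
    return e, mu0


def cross_fitted_dr_att(rows, n_folds=2):
    """Doubly robust estimate of the effect in the trial arm with cross-fitting:
    nuisance models are fitted on the other folds and applied to the held-out fold."""
    folds = [rows[k::n_folds] for k in range(n_folds)]
    numerator, n_treated = 0.0, 0
    for k, held_out in enumerate(folds):
        train = [row for j, fold in enumerate(folds) if j != k for row in fold]
        e, mu0 = fit_nuisances(train)
        for x, a, y in held_out:
            residual = y - mu0[x]                     # outcome-model residual
            if a == 1:
                numerator += residual                 # G-computation-style term
                n_treated += 1
            else:
                # Odds-weighted control residual corrects outcome-model error.
                numerator -= e[x] / (1.0 - e[x]) * residual
    return numerator / n_treated


if __name__ == "__main__":
    print("cross-fitted doubly robust estimate:",
          round(cross_fitted_dr_att(simulate(4000)), 3))
```

The estimate combines an outcome-model prediction with a propensity-weighted correction, so it remains consistent if either nuisance model is correctly specified. Fitting the nuisance models out of fold is what allows flexible learners to be plugged in.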