External control arm analysis: an evaluation of propensity score approaches, G-computation, and doubly debiased machine learning

BACKGROUND: An external control arm is a cohort of control patients collected from data external to a single-arm trial. To provide an unbiased estimation of efficacy, the clinical profiles of patients from the single-arm trial and the external control arm should be aligned, typically using propensity score approach...

Bibliographic Details
Main authors:	Loiseau, Nicolas; Trichelair, Paul; He, Maxime; Andreux, Mathieu; Zaslavskiy, Mikhail; Wainrib, Gilles; Blum, Michael G. B.
Format:	Online Article Text
Language:	English
Published:	BioMed Central 2022
Subjects:	Research Article
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9795588/ https://www.ncbi.nlm.nih.gov/pubmed/36577946 http://dx.doi.org/10.1186/s12874-022-01799-z

_version_	1784860294636371968
author	Loiseau, Nicolas; Trichelair, Paul; He, Maxime; Andreux, Mathieu; Zaslavskiy, Mikhail; Wainrib, Gilles; Blum, Michael G. B.
author_sort	Loiseau, Nicolas
collection	PubMed
description	BACKGROUND: An external control arm is a cohort of control patients collected from data external to a single-arm trial. To provide an unbiased estimation of efficacy, the clinical profiles of patients from the single-arm trial and the external control arm should be aligned, typically using propensity score approaches. Alternative approaches infer efficacy by comparing the outcomes of single-arm patients with machine-learning predictions of control patient outcomes. These methods include G-computation and Doubly Debiased Machine Learning (DDML), but they have not been sufficiently evaluated for External Control Arm (ECA) analysis. METHODS: We consider both numerical simulations and a trial replication procedure to evaluate the different statistical approaches: propensity score matching, Inverse Probability of Treatment Weighting (IPTW), G-computation, and DDML. The replication study relies on five type 2 diabetes randomized clinical trials provided by the Yale University Open Data Access (YODA) project. From the pool of five trials, observational experiments are artificially built by replacing the control arm of one trial with an arm from another trial that contains similarly treated patients. RESULTS: Among the different statistical approaches, numerical simulations show that DDML has the smallest bias, followed by G-computation. G-computation usually minimizes mean squared error. Compared to other methods, DDML has variable mean squared error performance, which improves as sample size increases. For hypothesis testing, all methods control type I error and DDML is the most conservative. G-computation is the best method in terms of statistical power, and DDML has comparable power at [Formula: see text] but lower power for smaller sample sizes. The replication procedure also indicates that G-computation minimizes mean squared error, whereas DDML has intermediate performance between G-computation and propensity score approaches. The confidence intervals of G-computation are the narrowest, whereas confidence intervals obtained with DDML are the widest for small sample sizes, which confirms its conservative nature. CONCLUSIONS: For external control arm analyses, methods based on outcome prediction models can reduce estimation error and increase statistical power compared to propensity score approaches.
format	Online Article Text
id	pubmed-9795588
institution	National Center for Biotechnology Information
language	English
publishDate	2022
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-9795588 2022-12-29 BMC Med Res Methodol Research Article BioMed Central 2022-12-28 /pmc/articles/PMC9795588/ /pubmed/36577946 http://dx.doi.org/10.1186/s12874-022-01799-z Text en © The Author(s) 2022. Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License (https://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. The Creative Commons Public Domain Dedication waiver (https://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
title	External control arm analysis: an evaluation of propensity score approaches, G-computation, and doubly debiased machine learning
topic	Research Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9795588/ https://www.ncbi.nlm.nih.gov/pubmed/36577946 http://dx.doi.org/10.1186/s12874-022-01799-z
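
The abstract in this record contrasts propensity score weighting with approaches that compare single-arm outcomes against model predictions of control outcomes. The sketch below illustrates that contrast on simulated data and is not the authors' code. Everything in it is an assumption made for illustration: the data-generating process, the function names, the choice of the average treatment effect on the treated as the estimand, the linear outcome model, and the logistic propensity model. DDML, which additionally combines both models with cross-fitting, is not shown.

```python
import numpy as np


def simulate(n=2000, effect=1.0, seed=0):
    """Hypothetical data: t=1 is the single-arm trial, t=0 the external controls."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=(n, 2))
    # Arm membership depends on covariates, so the two arms are not comparable as-is.
    p = 1.0 / (1.0 + np.exp(-(0.8 * x[:, 0] - 0.5 * x[:, 1])))
    t = rng.binomial(1, p)
    y = 2.0 * x[:, 0] + 1.0 * x[:, 1] + effect * t + rng.normal(size=n)
    return x, t, y


def add_intercept(x):
    return np.column_stack([np.ones(len(x)), x])


def fit_logistic(x, t, n_iter=25):
    """Logistic regression fitted with Newton's method; returns the coefficients."""
    xb = add_intercept(x)
    beta = np.zeros(xb.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-xb @ beta))
        grad = xb.T @ (t - p)
        hess = xb.T @ (xb * (p * (1.0 - p))[:, None])
        beta += np.linalg.solve(hess, grad)
    return beta


def att_naive(t, y):
    """Unadjusted difference in mean outcomes between the two arms."""
    return float(np.mean(y[t == 1]) - np.mean(y[t == 0]))


def att_iptw(x, t, y):
    """IPTW: reweight external controls by the odds of being in the trial."""
    e = 1.0 / (1.0 + np.exp(-add_intercept(x) @ fit_logistic(x, t)))
    w = e[t == 0] / (1.0 - e[t == 0])
    return float(np.mean(y[t == 1]) - np.sum(w * y[t == 0]) / np.sum(w))


def att_gcomp(x, t, y):
    """G-computation: fit an outcome model on controls, predict for trial patients."""
    coef, *_ = np.linalg.lstsq(add_intercept(x[t == 0]), y[t == 0], rcond=None)
    mu0 = add_intercept(x[t == 1]) @ coef  # predicted control outcome
    return float(np.mean(y[t == 1] - mu0))


x, t, y = simulate()
naive, iptw, gcomp = att_naive(t, y), att_iptw(x, t, y), att_gcomp(x, t, y)
print(f"true effect 1.0 | naive {naive:.2f} | IPTW {iptw:.2f} | G-computation {gcomp:.2f}")
```

Both models are correctly specified in this toy setting, so the sketch shows only the mechanics of each estimator and says nothing about their relative performance on real trial data.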
