Optimal adaptive allocation using deep reinforcement learning in a dose‐response study

Bibliographic Details
Main Authors: Matsuura, Kentaro; Honda, Junya; El Hanafi, Imad; Sozu, Takashi; Sakamaki, Kentaro
Format: Online Article Text
Language: English
Published: John Wiley and Sons Inc. 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9298337/
https://www.ncbi.nlm.nih.gov/pubmed/34747043
http://dx.doi.org/10.1002/sim.9247
_version_ 1784750684029059072
author Matsuura, Kentaro
Honda, Junya
El Hanafi, Imad
Sozu, Takashi
Sakamaki, Kentaro
collection PubMed
description Estimation of the dose‐response curve for efficacy and subsequent selection of an appropriate dose in phase II trials are important processes in drug development. Various methods have been investigated to estimate dose‐response curves. Generally, these methods are used with equal allocation of subjects for simplicity; nevertheless, they may not fully optimize performance metrics because of nonoptimal allocation. Optimal allocation methods, which include adaptive allocation methods, have been proposed to overcome the limitations of equal allocation. However, they rely on asymptotics, and thus sometimes cannot efficiently optimize the performance metric with the sample size in an actual clinical trial. The purpose of this study is to construct an adaptive allocation rule that directly optimizes a performance metric, such as power, accuracy of model selection, accuracy of the estimated target dose, or mean absolute error over the estimated dose‐response curve. We demonstrate that deep reinforcement learning with an appropriately defined state and reward can be used to construct such an adaptive allocation rule. The simulation study shows that the proposed method can successfully improve the performance metric to be optimized when compared with the equal allocation, D‐optimal, and TD‐optimal methods. In particular, when the mean absolute error was set to the metric to be optimized, it is possible to construct a rule that is superior for many metrics.
format Online
Article
Text
id pubmed-9298337
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher John Wiley and Sons Inc.
record_format MEDLINE/PubMed
spelling pubmed-9298337 2022-07-21 Optimal adaptive allocation using deep reinforcement learning in a dose‐response study Matsuura, Kentaro Honda, Junya El Hanafi, Imad Sozu, Takashi Sakamaki, Kentaro Stat Med Research Articles John Wiley and Sons Inc. 2021-11-07 2022-03-30 /pmc/articles/PMC9298337/ /pubmed/34747043 http://dx.doi.org/10.1002/sim.9247 Text en © 2021 The Authors. Statistics in Medicine published by John Wiley & Sons Ltd.
This is an open access article under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits use, distribution and reproduction in any medium, provided the original work is properly cited.
title Optimal adaptive allocation using deep reinforcement learning in a dose‐response study
topic Research Articles
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9298337/
https://www.ncbi.nlm.nih.gov/pubmed/34747043
http://dx.doi.org/10.1002/sim.9247