Optimal adaptive allocation using deep reinforcement learning in a dose‐response study
Estimation of the dose‐response curve for efficacy and subsequent selection of an appropriate dose in phase II trials are important processes in drug development. Various methods have been investigated to estimate dose‐response curves. Generally, these methods are used with equal allocation of subje...
Main authors: | Matsuura, Kentaro; Honda, Junya; El Hanafi, Imad; Sozu, Takashi; Sakamaki, Kentaro |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | John Wiley and Sons Inc. 2021 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9298337/ https://www.ncbi.nlm.nih.gov/pubmed/34747043 http://dx.doi.org/10.1002/sim.9247 |
_version_ | 1784750684029059072 |
---|---|
author | Matsuura, Kentaro Honda, Junya El Hanafi, Imad Sozu, Takashi Sakamaki, Kentaro |
author_facet | Matsuura, Kentaro Honda, Junya El Hanafi, Imad Sozu, Takashi Sakamaki, Kentaro |
author_sort | Matsuura, Kentaro |
collection | PubMed |
description | Estimation of the dose‐response curve for efficacy and subsequent selection of an appropriate dose in phase II trials are important processes in drug development. Various methods have been investigated to estimate dose‐response curves. Generally, these methods are used with equal allocation of subjects for simplicity; nevertheless, they may not fully optimize performance metrics because of nonoptimal allocation. Optimal allocation methods, which include adaptive allocation methods, have been proposed to overcome the limitations of equal allocation. However, they rely on asymptotics, and thus sometimes cannot efficiently optimize the performance metric with the sample size in an actual clinical trial. The purpose of this study is to construct an adaptive allocation rule that directly optimizes a performance metric, such as power, accuracy of model selection, accuracy of the estimated target dose, or mean absolute error over the estimated dose‐response curve. We demonstrate that deep reinforcement learning with an appropriately defined state and reward can be used to construct such an adaptive allocation rule. The simulation study shows that the proposed method can successfully improve the performance metric to be optimized when compared with the equal allocation, D‐optimal, and TD‐optimal methods. In particular, when the mean absolute error was set to the metric to be optimized, it is possible to construct a rule that is superior for many metrics. |
format | Online Article Text |
id | pubmed-9298337 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | John Wiley and Sons Inc. |
record_format | MEDLINE/PubMed |
spelling | pubmed-92983372022-07-21 Optimal adaptive allocation using deep reinforcement learning in a dose‐response study Matsuura, Kentaro Honda, Junya El Hanafi, Imad Sozu, Takashi Sakamaki, Kentaro Stat Med Research Articles Estimation of the dose‐response curve for efficacy and subsequent selection of an appropriate dose in phase II trials are important processes in drug development. Various methods have been investigated to estimate dose‐response curves. Generally, these methods are used with equal allocation of subjects for simplicity; nevertheless, they may not fully optimize performance metrics because of nonoptimal allocation. Optimal allocation methods, which include adaptive allocation methods, have been proposed to overcome the limitations of equal allocation. However, they rely on asymptotics, and thus sometimes cannot efficiently optimize the performance metric with the sample size in an actual clinical trial. The purpose of this study is to construct an adaptive allocation rule that directly optimizes a performance metric, such as power, accuracy of model selection, accuracy of the estimated target dose, or mean absolute error over the estimated dose‐response curve. We demonstrate that deep reinforcement learning with an appropriately defined state and reward can be used to construct such an adaptive allocation rule. The simulation study shows that the proposed method can successfully improve the performance metric to be optimized when compared with the equal allocation, D‐optimal, and TD‐optimal methods. In particular, when the mean absolute error was set to the metric to be optimized, it is possible to construct a rule that is superior for many metrics. John Wiley and Sons Inc. 2021-11-07 2022-03-30 /pmc/articles/PMC9298337/ /pubmed/34747043 http://dx.doi.org/10.1002/sim.9247 Text en © 2021 The Authors. Statistics in Medicine published by John Wiley & Sons Ltd. 
https://creativecommons.org/licenses/by/4.0/ This is an open access article under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits use, distribution and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Research Articles Matsuura, Kentaro Honda, Junya El Hanafi, Imad Sozu, Takashi Sakamaki, Kentaro Optimal adaptive allocation using deep reinforcement learning in a dose‐response study |
title | Optimal adaptive allocation using deep reinforcement learning in a dose‐response study |
title_full | Optimal adaptive allocation using deep reinforcement learning in a dose‐response study |
title_fullStr | Optimal adaptive allocation using deep reinforcement learning in a dose‐response study |
title_full_unstemmed | Optimal adaptive allocation using deep reinforcement learning in a dose‐response study |
title_short | Optimal adaptive allocation using deep reinforcement learning in a dose‐response study |
title_sort | optimal adaptive allocation using deep reinforcement learning in a dose‐response study |
topic | Research Articles |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9298337/ https://www.ncbi.nlm.nih.gov/pubmed/34747043 http://dx.doi.org/10.1002/sim.9247 |
work_keys_str_mv | AT matsuurakentaro optimaladaptiveallocationusingdeepreinforcementlearninginadoseresponsestudy AT hondajunya optimaladaptiveallocationusingdeepreinforcementlearninginadoseresponsestudy AT elhanafiimad optimaladaptiveallocationusingdeepreinforcementlearninginadoseresponsestudy AT sozutakashi optimaladaptiveallocationusingdeepreinforcementlearninginadoseresponsestudy AT sakamakikentaro optimaladaptiveallocationusingdeepreinforcementlearninginadoseresponsestudy |
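
The description field above frames adaptive allocation as a reinforcement learning problem with a defined state and reward. The following is a minimal sketch of that framing only, not the authors' implementation: the dose levels, the Emax-shaped curve, the noise level, and every function name are hypothetical, the state is reduced to per-dose counts and sample means, and the curve is estimated by raw sample means rather than by a fitted dose-response model. The terminal reward is the negative mean absolute error, one of the metrics the abstract names. A trained policy network would take the place of the `policy` argument; none is trained here.

```python
import random

# Hypothetical dose levels (not taken from the paper).
DOSES = [0.0, 2.0, 4.0, 6.0, 8.0]


def true_response(dose):
    """Hypothetical Emax-shaped curve, used only to simulate outcomes."""
    return 1.65 * dose / (dose + 2.0)


def equal_allocation(counts, means, rng):
    """Baseline policy: assign the next subject to the least-used dose."""
    return min(range(len(counts)), key=lambda i: counts[i])


def random_allocation(counts, means, rng):
    """Baseline policy: assign the next subject to a uniformly random dose."""
    return rng.randrange(len(counts))


def run_trial(policy, n_subjects=150, sd=1.5, rng=None):
    """Simulate one trial; return (terminal reward, subjects per dose)."""
    rng = rng or random.Random()
    counts = [0] * len(DOSES)
    sums = [0.0] * len(DOSES)
    for _ in range(n_subjects):
        # State: how many subjects each dose has and the observed mean response.
        means = [s / c if c else 0.0 for s, c in zip(sums, counts)]
        action = policy(tuple(counts), tuple(means), rng)
        outcome = rng.gauss(true_response(DOSES[action]), sd)
        counts[action] += 1
        sums[action] += outcome
    # Simplified estimate: per-dose sample means (0.0 for an unused dose).
    estimates = [s / c if c else 0.0 for s, c in zip(sums, counts)]
    mae = sum(abs(e - true_response(d)) for e, d in zip(estimates, DOSES))
    mae /= len(DOSES)
    return -mae, counts  # reward is given only at the end of the trial


def average_reward(policy, n_trials=200, seed=0):
    """Monte Carlo estimate of the expected terminal reward of a policy."""
    rng = random.Random(seed)
    total = sum(run_trial(policy, rng=rng)[0] for _ in range(n_trials))
    return total / n_trials


if __name__ == "__main__":
    print("equal :", average_reward(equal_allocation))
    print("random:", average_reward(random_allocation))
```

Giving the reward only at the end of the trial mirrors the idea of optimizing a final performance metric directly; swapping the MAE for another metric would change only the last lines of `run_trial`.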