Adaptive divergence for rapid adversarial optimization
Adversarial Optimization provides a reliable, practical way to match two implicitly defined distributions, one of which is typically represented by a sample of real data, and the other is represented by a parameterized generator. Matching of the distributions is achieved by minimizing a divergence b...
Main Authors: | Borisyak, Maxim; Gaintseva, Tatiana; Ustyuzhanin, Andrey |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | PeerJ Inc., 2020 |
Subjects: | Data Mining and Machine Learning |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7924553/ https://www.ncbi.nlm.nih.gov/pubmed/33816925 http://dx.doi.org/10.7717/peerj-cs.274 |
_version_ | 1783659113102704640 |
---|---|
author | Borisyak, Maxim Gaintseva, Tatiana Ustyuzhanin, Andrey |
author_facet | Borisyak, Maxim Gaintseva, Tatiana Ustyuzhanin, Andrey |
author_sort | Borisyak, Maxim |
collection | PubMed |
description | Adversarial Optimization provides a reliable, practical way to match two implicitly defined distributions, one of which is typically represented by a sample of real data, and the other is represented by a parameterized generator. Matching of the distributions is achieved by minimizing a divergence between these distributions, and estimation of the divergence involves a secondary optimization task, which typically requires training a model to discriminate between these distributions. The choice of the model involves a trade-off: high-capacity models provide good estimations of the divergence but generally require large sample sizes to be properly trained. In contrast, low-capacity models tend to require fewer samples for training; however, they might provide biased estimations. The computational costs of Adversarial Optimization become significant when sampling from the generator is expensive. One practical example of such a setting is fine-tuning the parameters of complex computer simulations. In this work, we introduce a novel family of divergences that enables faster optimization convergence, measured by the number of samples drawn from the generator. Varying the capacity of the underlying discriminator model during optimization leads to a significant speed-up. The proposed divergence family suggests using low-capacity models to compare distant distributions (typically, at early optimization steps), with the capacity gradually growing as the distributions become closer to each other. Thus, it allows for a significant acceleration of the initial stages of optimization. This acceleration was demonstrated on two fine-tuning problems involving the Pythia event generator and two of the most popular black-box optimization algorithms: Bayesian Optimization and Variational Optimization. Experiments show that, given the same budget, adaptive divergences yield results up to an order of magnitude closer to the optimum than the Jensen-Shannon divergence. While we consider physics-related simulations, adaptive divergences can be applied to any stochastic simulation. |
format | Online Article Text |
id | pubmed-7924553 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | PeerJ Inc. |
record_format | MEDLINE/PubMed |
spelling | pubmed-79245532021-04-02 Adaptive divergence for rapid adversarial optimization Borisyak, Maxim Gaintseva, Tatiana Ustyuzhanin, Andrey PeerJ Comput Sci Data Mining and Machine Learning Adversarial Optimization provides a reliable, practical way to match two implicitly defined distributions, one of which is typically represented by a sample of real data, and the other is represented by a parameterized generator. Matching of the distributions is achieved by minimizing a divergence between these distributions, and estimation of the divergence involves a secondary optimization task, which typically requires training a model to discriminate between these distributions. The choice of the model involves a trade-off: high-capacity models provide good estimations of the divergence but generally require large sample sizes to be properly trained. In contrast, low-capacity models tend to require fewer samples for training; however, they might provide biased estimations. The computational costs of Adversarial Optimization become significant when sampling from the generator is expensive. One practical example of such a setting is fine-tuning the parameters of complex computer simulations. In this work, we introduce a novel family of divergences that enables faster optimization convergence, measured by the number of samples drawn from the generator. Varying the capacity of the underlying discriminator model during optimization leads to a significant speed-up. The proposed divergence family suggests using low-capacity models to compare distant distributions (typically, at early optimization steps), with the capacity gradually growing as the distributions become closer to each other. Thus, it allows for a significant acceleration of the initial stages of optimization. This acceleration was demonstrated on two fine-tuning problems involving the Pythia event generator and two of the most popular black-box optimization algorithms: Bayesian Optimization and Variational Optimization. Experiments show that, given the same budget, adaptive divergences yield results up to an order of magnitude closer to the optimum than the Jensen-Shannon divergence. While we consider physics-related simulations, adaptive divergences can be applied to any stochastic simulation. PeerJ Inc. 2020-05-18 /pmc/articles/PMC7924553/ /pubmed/33816925 http://dx.doi.org/10.7717/peerj-cs.274 Text en ©2020 Borisyak et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ Computer Science) and either DOI or URL of the article must be cited. |
spellingShingle | Data Mining and Machine Learning Borisyak, Maxim Gaintseva, Tatiana Ustyuzhanin, Andrey Adaptive divergence for rapid adversarial optimization |
title | Adaptive divergence for rapid adversarial optimization |
title_full | Adaptive divergence for rapid adversarial optimization |
title_fullStr | Adaptive divergence for rapid adversarial optimization |
title_full_unstemmed | Adaptive divergence for rapid adversarial optimization |
title_short | Adaptive divergence for rapid adversarial optimization |
title_sort | adaptive divergence for rapid adversarial optimization |
topic | Data Mining and Machine Learning |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7924553/ https://www.ncbi.nlm.nih.gov/pubmed/33816925 http://dx.doi.org/10.7717/peerj-cs.274 |
work_keys_str_mv | AT borisyakmaxim adaptivedivergenceforrapidadversarialoptimization AT gaintsevatatiana adaptivedivergenceforrapidadversarialoptimization AT ustyuzhaninandrey adaptivedivergenceforrapidadversarialoptimization |
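
The description field above explains the core idea of adaptive divergences: compare distant distributions with a cheap, low-capacity discriminator and only spend generator samples on higher-capacity models once the distributions become hard to tell apart. The sketch below is not the authors' construction from the article; it is a minimal illustration of that idea under stated assumptions: a scikit-learn discriminator family ordered by capacity, a Jensen-Shannon lower bound estimated from held-out cross-entropy, and an arbitrary threshold for deciding that the distributions are still "far apart". All names and parameters here are hypothetical.

```python
# Illustrative sketch only: estimate a Jensen-Shannon-style divergence between
# two samples with discriminators of increasing capacity, stopping at the first
# capacity level whose estimate already exceeds a threshold.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import log_loss


def js_lower_bound(model, X_test, y_test):
    """log(2) minus the held-out cross-entropy is a lower bound on the JS divergence."""
    p = model.predict_proba(X_test)[:, 1]
    return np.log(2.0) - log_loss(y_test, p)


def adaptive_divergence(real, generated, threshold=0.1, random_state=0):
    """Return a divergence estimate using the cheapest sufficient discriminator."""
    X = np.vstack([real, generated])
    y = np.concatenate([np.ones(len(real)), np.zeros(len(generated))])
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.5, random_state=random_state, stratify=y
    )
    # Hypothetical discriminator family ordered by capacity:
    # a linear model first, then progressively wider neural networks.
    models = [
        LogisticRegression(max_iter=1000),
        MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000, random_state=random_state),
        MLPClassifier(hidden_layer_sizes=(64, 64), max_iter=2000, random_state=random_state),
    ]
    estimate = 0.0
    for model in models:
        model.fit(X_tr, y_tr)
        estimate = max(estimate, js_lower_bound(model, X_te, y_te))
        if estimate > threshold:  # distributions are still clearly distinguishable,
            break                 # so a low-capacity model is enough
    return estimate


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    real = rng.normal(0.0, 1.0, size=(2000, 2))
    generated = rng.normal(1.5, 1.0, size=(2000, 2))  # distant: the linear model suffices
    print(adaptive_divergence(real, generated))
```

With the toy Gaussian samples above, the linear discriminator already separates the two distributions, so the more expensive networks are never trained; as the generated distribution approaches the real one, the loop falls through to higher-capacity models, mirroring the capacity growth described in the abstract.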