Adaptive divergence for rapid adversarial optimization
Adversarial Optimization provides a reliable, practical way to match two implicitly defined distributions, one of which is typically represented by a sample of real data, and the other is represented by a parameterized generator. Matching of the distributions is achieved by minimizing a divergence b...
Main Authors: | Borisyak, Maxim; Gaintseva, Tatiana; Ustyuzhanin, Andrey |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | PeerJ Inc., 2020 |
Subjects: | Data Mining and Machine Learning |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7924553/ https://www.ncbi.nlm.nih.gov/pubmed/33816925 http://dx.doi.org/10.7717/peerj-cs.274 |
_version_ | 1783659113102704640 |
---|---|
author | Borisyak, Maxim Gaintseva, Tatiana Ustyuzhanin, Andrey |
author_facet | Borisyak, Maxim Gaintseva, Tatiana Ustyuzhanin, Andrey |
author_sort | Borisyak, Maxim |
collection | PubMed |
description | Adversarial Optimization provides a reliable, practical way to match two implicitly defined distributions, one of which is typically represented by a sample of real data, and the other is represented by a parameterized generator. Matching of the distributions is achieved by minimizing a divergence between these distributions, and estimation of the divergence involves a secondary optimization task, which typically requires training a model to discriminate between these distributions. The choice of the model involves a trade-off: high-capacity models provide good estimations of the divergence but generally require large sample sizes to be properly trained. In contrast, low-capacity models tend to require fewer samples for training; however, they might provide biased estimations. The computational costs of Adversarial Optimization become significant when sampling from the generator is expensive. One practical example of such a setting is fine-tuning the parameters of complex computer simulations. In this work, we introduce a novel family of divergences that enables faster optimization convergence, measured by the number of samples drawn from the generator. Varying the capacity of the underlying discriminator model during optimization leads to a significant speed-up. The proposed divergence family suggests using low-capacity models to compare distant distributions (typically, at early optimization steps), with the capacity gradually growing as the distributions become closer to each other. Thus, it allows for a significant acceleration of the initial stages of optimization. This acceleration was demonstrated on two fine-tuning problems involving the Pythia event generator and two of the most popular black-box optimization algorithms: Bayesian Optimization and Variational Optimization. Experiments show that, given the same budget, adaptive divergences yield results up to an order of magnitude closer to the optimum than the Jensen-Shannon divergence. While we consider physics-related simulations, adaptive divergences can be applied to any stochastic simulation. |
format | Online Article Text |
id | pubmed-7924553 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | PeerJ Inc. |
record_format | MEDLINE/PubMed |
spelling | pubmed-79245532021-04-02 Adaptive divergence for rapid adversarial optimization Borisyak, Maxim Gaintseva, Tatiana Ustyuzhanin, Andrey PeerJ Comput Sci Data Mining and Machine Learning Adversarial Optimization provides a reliable, practical way to match two implicitly defined distributions, one of which is typically represented by a sample of real data, and the other is represented by a parameterized generator. Matching of the distributions is achieved by minimizing a divergence between these distributions, and estimation of the divergence involves a secondary optimization task, which typically requires training a model to discriminate between these distributions. The choice of the model involves a trade-off: high-capacity models provide good estimations of the divergence but generally require large sample sizes to be properly trained. In contrast, low-capacity models tend to require fewer samples for training; however, they might provide biased estimations. The computational costs of Adversarial Optimization become significant when sampling from the generator is expensive. One practical example of such a setting is fine-tuning the parameters of complex computer simulations. In this work, we introduce a novel family of divergences that enables faster optimization convergence, measured by the number of samples drawn from the generator. Varying the capacity of the underlying discriminator model during optimization leads to a significant speed-up. The proposed divergence family suggests using low-capacity models to compare distant distributions (typically, at early optimization steps), with the capacity gradually growing as the distributions become closer to each other. Thus, it allows for a significant acceleration of the initial stages of optimization. This acceleration was demonstrated on two fine-tuning problems involving the Pythia event generator and two of the most popular black-box optimization algorithms: Bayesian Optimization and Variational Optimization. Experiments show that, given the same budget, adaptive divergences yield results up to an order of magnitude closer to the optimum than the Jensen-Shannon divergence. While we consider physics-related simulations, adaptive divergences can be applied to any stochastic simulation. PeerJ Inc. 2020-05-18 /pmc/articles/PMC7924553/ /pubmed/33816925 http://dx.doi.org/10.7717/peerj-cs.274 Text en ©2020 Borisyak et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ Computer Science) and either DOI or URL of the article must be cited. |
spellingShingle | Data Mining and Machine Learning Borisyak, Maxim Gaintseva, Tatiana Ustyuzhanin, Andrey Adaptive divergence for rapid adversarial optimization |
title | Adaptive divergence for rapid adversarial optimization |
title_full | Adaptive divergence for rapid adversarial optimization |
title_fullStr | Adaptive divergence for rapid adversarial optimization |
title_full_unstemmed | Adaptive divergence for rapid adversarial optimization |
title_short | Adaptive divergence for rapid adversarial optimization |
title_sort | adaptive divergence for rapid adversarial optimization |
topic | Data Mining and Machine Learning |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7924553/ https://www.ncbi.nlm.nih.gov/pubmed/33816925 http://dx.doi.org/10.7717/peerj-cs.274 |
work_keys_str_mv | AT borisyakmaxim adaptivedivergenceforrapidadversarialoptimization AT gaintsevatatiana adaptivedivergenceforrapidadversarialoptimization AT ustyuzhaninandrey adaptivedivergenceforrapidadversarialoptimization |
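
The description field above explains the core idea of adaptive divergences: compare distant distributions with a cheap, low-capacity discriminator and only spend generator samples on higher-capacity models once the distributions become hard to tell apart. The sketch below is not the authors' construction from the article; it is a minimal illustration of that idea under stated assumptions: a scikit-learn discriminator family ordered by capacity, a Jensen-Shannon lower bound estimated from held-out cross-entropy, and an arbitrary threshold for deciding that the distributions are still "far apart". All names and parameters here are hypothetical.

```python
# Illustrative sketch only: estimate a Jensen-Shannon-style divergence between
# two samples with discriminators of increasing capacity, stopping at the first
# capacity level whose estimate already exceeds a threshold.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import log_loss


def js_lower_bound(model, X_test, y_test):
    """log(2) minus the held-out cross-entropy is a lower bound on the JS divergence."""
    p = model.predict_proba(X_test)[:, 1]
    return np.log(2.0) - log_loss(y_test, p)


def adaptive_divergence(real, generated, threshold=0.1, random_state=0):
    """Return a divergence estimate using the cheapest sufficient discriminator."""
    X = np.vstack([real, generated])
    y = np.concatenate([np.ones(len(real)), np.zeros(len(generated))])
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.5, random_state=random_state, stratify=y
    )
    # Hypothetical discriminator family ordered by capacity:
    # a linear model first, then progressively wider neural networks.
    models = [
        LogisticRegression(max_iter=1000),
        MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000, random_state=random_state),
        MLPClassifier(hidden_layer_sizes=(64, 64), max_iter=2000, random_state=random_state),
    ]
    estimate = 0.0
    for model in models:
        model.fit(X_tr, y_tr)
        estimate = max(estimate, js_lower_bound(model, X_te, y_te))
        if estimate > threshold:  # distributions are still clearly distinguishable,
            break                 # so a low-capacity model is enough
    return estimate


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    real = rng.normal(0.0, 1.0, size=(2000, 2))
    generated = rng.normal(1.5, 1.0, size=(2000, 2))  # distant: the linear model suffices
    print(adaptive_divergence(real, generated))
```

With the toy Gaussian samples above, the linear discriminator already separates the two distributions, so the more expensive networks are never trained; as the generated distribution approaches the real one, the loop falls through to higher-capacity models, mirroring the capacity growth described in the abstract.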