
Adaptive hyperparameter updating for training restricted Boltzmann machines on quantum annealers

Restricted Boltzmann Machines (RBMs) have been proposed for developing neural networks for a variety of unsupervised machine learning applications such as image recognition, drug discovery, and materials design. The Boltzmann probability distribution is used as a model to identify network parameters...


Bibliographic Details
Main authors: Xu, Guanglei; Oates, William S.
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7851404/
https://www.ncbi.nlm.nih.gov/pubmed/33526868
http://dx.doi.org/10.1038/s41598-021-82197-1
author Xu, Guanglei
Oates, William S.
author_facet Xu, Guanglei
Oates, William S.
author_sort Xu, Guanglei
collection PubMed
description Restricted Boltzmann Machines (RBMs) have been proposed for developing neural networks for a variety of unsupervised machine learning applications such as image recognition, drug discovery, and materials design. The Boltzmann probability distribution is used as a model to identify network parameters by optimizing the likelihood of predicting an output given hidden states trained on available data. Training such networks often requires sampling over a large probability space that must be approximated during gradient-based optimization. Quantum annealing has been proposed as a means to search this space more efficiently, and this has been investigated experimentally on D-Wave hardware. D-Wave implementation requires selection of an effective inverse temperature, or hyperparameter (β), within the Boltzmann distribution, which can strongly influence optimization. Here, we show how this parameter can be estimated as a hyperparameter applied to D-Wave hardware during neural network training by maximizing the likelihood or minimizing the Shannon entropy. We find that both methods improve training of RBMs, based on experimental validation with D-Wave hardware on an image recognition problem. Neural network image reconstruction errors are evaluated using Bayesian uncertainty analysis, which shows more than an order of magnitude lower image reconstruction error using the maximum likelihood method than manually optimizing the hyperparameter. The maximum likelihood method is also shown to outperform minimizing the Shannon entropy for image reconstruction.
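The abstract above mentions estimating the effective inverse temperature by maximizing the likelihood. This record does not give the authors' procedure, so the following is only a minimal textbook sketch of that general idea, under assumptions that are not from the source: a toy 3-spin energy function small enough to enumerate, and sample energies supplied by the caller rather than by D-Wave hardware. For a Boltzmann distribution p(s) ∝ exp(-β E(s)), the log-likelihood of observed samples is stationary where the model's mean energy equals the mean energy of the samples, and the model's mean energy decreases as β increases, so bisection finds the estimate. All function names here (`toy_energies`, `mean_energy`, `estimate_beta_ml`) are hypothetical.

```python
import itertools
import math


def toy_energies():
    """Energies of all 8 states of a hypothetical 3-spin chain (illustration only)."""
    return [-(s[0] * s[1] + s[1] * s[2]) - 0.5 * s[0]
            for s in itertools.product((-1, 1), repeat=3)]


def mean_energy(energies, beta):
    """Mean energy under p(s) proportional to exp(-beta * E(s))."""
    e_min = min(energies)  # shift energies so the largest weight is exp(0) = 1
    weights = [math.exp(-beta * (e - e_min)) for e in energies]
    return sum(w * e for w, e in zip(weights, energies)) / sum(weights)


def estimate_beta_ml(energies, sample_energies, lo=0.0, hi=50.0, tol=1e-10):
    """Maximum-likelihood beta: solve mean_energy(beta) == mean of sample energies.

    The model mean energy is non-increasing in beta (its derivative is minus
    the energy variance), so bisection on the bracket [lo, hi] is sufficient.
    The bracket itself is an assumption of this sketch.
    """
    target = sum(sample_energies) / len(sample_energies)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mean_energy(energies, mid) > target:
            lo = mid  # model is too "hot": increase beta
        else:
            hi = mid  # model is too "cold": decrease beta
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    energies = toy_energies()
    # Stand-in for hardware samples: one value equal to the exact mean energy at beta = 0.7.
    observed = [mean_energy(energies, 0.7)]
    print(estimate_beta_ml(energies, observed))
```

In this toy setting the printed estimate should be close to the β used to generate the stand-in data. On real hardware the state space cannot be enumerated, so the model mean energy would itself have to be approximated; that step is outside this sketch.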
format Online
Article
Text
id pubmed-7851404
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-7851404 2021-02-03 Adaptive hyperparameter updating for training restricted Boltzmann machines on quantum annealers Xu, Guanglei Oates, William S. Sci Rep Article
Nature Publishing Group UK 2021-02-01 /pmc/articles/PMC7851404/ /pubmed/33526868 http://dx.doi.org/10.1038/s41598-021-82197-1 Text en © The Author(s) 2021 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
spellingShingle Article
Xu, Guanglei
Oates, William S.
Adaptive hyperparameter updating for training restricted Boltzmann machines on quantum annealers
title Adaptive hyperparameter updating for training restricted Boltzmann machines on quantum annealers
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7851404/
https://www.ncbi.nlm.nih.gov/pubmed/33526868
http://dx.doi.org/10.1038/s41598-021-82197-1