DisSAGD: A Distributed Parameter Update Scheme Based on Variance Reduction
Machine learning models often converge slowly and are unstable due to the significant variance of random data when using a sample estimate gradient in SGD. To increase the speed of convergence and improve stability, a distributed SGD algorithm based on variance reduction, named DisSAGD, is proposed...
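The record does not reproduce DisSAGD's actual update rule, so the sketch below shows only a classic SAGA-style variance-reduced estimator as background: a fresh stochastic gradient is corrected with historical gradient information so that the estimate stays unbiased while its variance shrinks. Unlike the scheme described in the abstract, plain SAGA keeps a per-example gradient table; the names `saga_step` and `per_example_grad` are illustrative and not taken from the paper.

```python
import numpy as np

def saga_step(w, i, per_example_grad, grad_table, lr):
    """One SAGA-style variance-reduced update (background sketch, not DisSAGD itself).

    w                -- current parameter vector
    i                -- index of the sampled training example
    per_example_grad -- callable per_example_grad(w, i) returning that example's gradient
    grad_table       -- array storing the last gradient seen for every example
    lr               -- step size
    """
    g_new = per_example_grad(w, i)    # fresh stochastic gradient
    g_old = grad_table[i]             # historical gradient for the same example
    g_avg = grad_table.mean(axis=0)   # mean of the stored historical gradients
    v = g_new - g_old + g_avg         # variance-reduced gradient estimate
    grad_table[i] = g_new             # refresh the history; no full gradient pass
    return w - lr * v, grad_table
```

The correction term `g_avg - g_old` is a zero-mean control variate, so the estimate remains unbiased while its variance decreases as the stored gradients stabilize near the optimum.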
Main Authors: Pan, Haijie; Zheng, Lirong
Format: Online Article Text
Language: English
Published: MDPI, 2021
Subjects: Article
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8347539/ https://www.ncbi.nlm.nih.gov/pubmed/34372361 http://dx.doi.org/10.3390/s21155124
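The full description (reproduced in the metadata record below) also states that parameters are shared among cluster nodes over an asynchronous communication protocol, and that a faster working node uses its spare time to draw more samples for its next iteration. The toy sketch below illustrates that pattern with Python threads standing in for nodes; the `ParameterServer` class, the least-squares objective, and the batch-doubling rule are assumptions made for illustration and are not taken from the paper.

```python
import threading
import time

import numpy as np

class ParameterServer:
    """Toy in-process parameter store; workers pull and push without a global barrier."""
    def __init__(self, dim):
        self.w = np.zeros(dim)
        self._lock = threading.Lock()

    def pull(self):
        with self._lock:
            return self.w.copy()

    def push(self, delta):
        with self._lock:
            self.w += delta

def worker(server, X, y, rounds=100, base_batch=32, lr=0.01):
    """Asynchronous worker: no barrier between rounds; a worker that finishes
    quickly samples a larger mini-batch next time instead of idling."""
    rng = np.random.default_rng()
    batch = base_batch
    for _ in range(rounds):
        start = time.perf_counter()
        w = server.pull()                                    # read current global parameters
        idx = rng.choice(len(X), size=min(batch, len(X)), replace=False)
        grad = X[idx].T @ (X[idx] @ w - y[idx]) / len(idx)   # least-squares mini-batch gradient
        server.push(-lr * grad)                              # send the update without waiting for peers
        # Hypothetical adaptive sampling rule: spare time -> larger sample next round.
        batch = base_batch * 2 if (time.perf_counter() - start) < 0.002 else base_batch

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(2000, 10))
    y = X @ rng.normal(size=10)
    server = ParameterServer(dim=10)
    threads = [threading.Thread(target=worker, args=(server, X, y)) for _ in range(4)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    print("final parameter norm:", np.linalg.norm(server.w))
```

The actual system runs across machines, so the shared-memory server here only stands in for network communication between workers and a parameter server.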
Field | Value |
---|---|
_version_ | 1783735114419666944 |
author | Pan, Haijie; Zheng, Lirong |
author_facet | Pan, Haijie; Zheng, Lirong |
author_sort | Pan, Haijie |
collection | PubMed |
description | Machine learning models often converge slowly and are unstable due to the significant variance of random data when using a sample estimate gradient in SGD. To increase the speed of convergence and improve stability, a distributed SGD algorithm based on variance reduction, named DisSAGD, is proposed in this study. DisSAGD corrects the gradient estimate for each iteration by using the gradient variance of historical iterations, without full gradient computation or additional storage; that is, it reduces the mean variance of the historical gradients in order to reduce the error in the parameter updates. We implemented DisSAGD on distributed clusters to train machine learning models by sharing parameters among nodes with an asynchronous communication protocol. We also propose an adaptive learning rate strategy and a sampling strategy to address the update lag of the overall parameter distribution, which improves the convergence speed when the parameters deviate from the optimal value: when one working node is faster than another, it has more time to compute its local gradient and to draw more samples for the next iteration. Our experiments demonstrate that DisSAGD significantly reduces waiting times during loop iterations and converges faster than traditional methods, and that it achieves speedups on distributed clusters. |
format | Online Article Text |
id | pubmed-8347539 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-8347539 2021-08-08 Sensors (Basel) Article MDPI 2021-07-28 /pmc/articles/PMC8347539/ /pubmed/34372361 http://dx.doi.org/10.3390/s21155124 Text en © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). |
spellingShingle | Article Pan, Haijie Zheng, Lirong DisSAGD: A Distributed Parameter Update Scheme Based on Variance Reduction |
title | DisSAGD: A Distributed Parameter Update Scheme Based on Variance Reduction |
title_full | DisSAGD: A Distributed Parameter Update Scheme Based on Variance Reduction |
title_fullStr | DisSAGD: A Distributed Parameter Update Scheme Based on Variance Reduction |
title_full_unstemmed | DisSAGD: A Distributed Parameter Update Scheme Based on Variance Reduction |
title_short | DisSAGD: A Distributed Parameter Update Scheme Based on Variance Reduction |
title_sort | dissagd: a distributed parameter update scheme based on variance reduction |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8347539/ https://www.ncbi.nlm.nih.gov/pubmed/34372361 http://dx.doi.org/10.3390/s21155124 |
work_keys_str_mv | AT panhaijie dissagdadistributedparameterupdateschemebasedonvariancereduction AT zhenglirong dissagdadistributedparameterupdateschemebasedonvariancereduction |