AdaCN: An Adaptive Cubic Newton Method for Nonconvex Stochastic Optimization
In this work, we introduce AdaCN, a novel adaptive cubic Newton method for nonconvex stochastic optimization. AdaCN dynamically captures the curvature of the loss landscape using a diagonally approximated Hessian plus the norm of the difference between the previous two estimates. It only requires at most first o...
Main authors: | Liu, Yan; Zhang, Maojun; Zhong, Zhiwei; Zeng, Xiangrong |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Hindawi 2021 |
Subjects: | Research Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8598341/ https://www.ncbi.nlm.nih.gov/pubmed/34804146 http://dx.doi.org/10.1155/2021/5790608 |
_version_ | 1784600804053745664 |
---|---|
author | Liu, Yan; Zhang, Maojun; Zhong, Zhiwei; Zeng, Xiangrong |
author_facet | Liu, Yan; Zhang, Maojun; Zhong, Zhiwei; Zeng, Xiangrong |
author_sort | Liu, Yan |
collection | PubMed |
description | In this work, we introduce AdaCN, a novel adaptive cubic Newton method for nonconvex stochastic optimization. AdaCN dynamically captures the curvature of the loss landscape using a diagonally approximated Hessian plus the norm of the difference between the previous two estimates. It requires at most first-order gradients and updates with linear complexity in both time and memory. To reduce the variance introduced by the stochastic nature of the problem, AdaCN uses the first and second moments to implement an exponential moving average on the iteratively updated stochastic gradients and approximated stochastic Hessians, respectively. We validate AdaCN in extensive experiments, showing that it outperforms other stochastic first-order methods (including SGD, Adam, and AdaBound) and a stochastic quasi-Newton method (i.e., Apollo) in terms of both convergence speed and generalization performance. |
format | Online Article Text |
id | pubmed-8598341 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | Hindawi |
record_format | MEDLINE/PubMed |
spelling | pubmed-8598341 2021-11-18 AdaCN: An Adaptive Cubic Newton Method for Nonconvex Stochastic Optimization Liu, Yan; Zhang, Maojun; Zhong, Zhiwei; Zeng, Xiangrong Comput Intell Neurosci Research Article In this work, we introduce AdaCN, a novel adaptive cubic Newton method for nonconvex stochastic optimization. AdaCN dynamically captures the curvature of the loss landscape using a diagonally approximated Hessian plus the norm of the difference between the previous two estimates. It requires at most first-order gradients and updates with linear complexity in both time and memory. To reduce the variance introduced by the stochastic nature of the problem, AdaCN uses the first and second moments to implement an exponential moving average on the iteratively updated stochastic gradients and approximated stochastic Hessians, respectively. We validate AdaCN in extensive experiments, showing that it outperforms other stochastic first-order methods (including SGD, Adam, and AdaBound) and a stochastic quasi-Newton method (i.e., Apollo) in terms of both convergence speed and generalization performance. Hindawi 2021-11-10 /pmc/articles/PMC8598341/ /pubmed/34804146 http://dx.doi.org/10.1155/2021/5790608 Text en Copyright © 2021 Yan Liu et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Research Article Liu, Yan Zhang, Maojun Zhong, Zhiwei Zeng, Xiangrong AdaCN: An Adaptive Cubic Newton Method for Nonconvex Stochastic Optimization |
title | AdaCN: An Adaptive Cubic Newton Method for Nonconvex Stochastic Optimization |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8598341/ https://www.ncbi.nlm.nih.gov/pubmed/34804146 http://dx.doi.org/10.1155/2021/5790608 |