Convergence analysis of AdaBound with relaxed bound functions for non-convex optimization.

Abstract

Clipping the learning rates in Adam yields an effective stochastic algorithm, AdaBound. Despite its effectiveness in practice, the convergence analysis of AdaBound has not been fully explored, especially for non-convex optimization. To this end, we study the convergence of the last individual output of AdaBound for non-convex stochastic optimization problems, known as individual convergence. We prove that, as AdaBound iterates, the cost function converges to a finite value and the corresponding gradient converges to zero. The novelty of the proof lies in conditions on the bound functions and momentum factors that are much weaker than in existing results: we drop the monotonicity and convergence requirements on the bound functions and keep only their boundedness, and the momentum factors can be fixed at a constant value rather than required to decrease monotonically. This provides a new perspective on the bound functions and momentum factors of AdaBound. Finally, numerical experiments corroborate our theory and show that the convergence of AdaBound extends to more general bound functions.

Copyright © 2021 Elsevier Ltd. All rights reserved.
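For readers unfamiliar with the algorithm itself, the sketch below illustrates the clipping step that distinguishes AdaBound from Adam: the Adam-style per-coordinate step size is clipped into an interval given by two bound functions eta_l and eta_u before the momentum update is applied. This is a minimal NumPy sketch under stated assumptions, not the authors' implementation; the constant bound functions are illustrative placeholders (constant bounds are enough under the relaxed boundedness condition discussed in the abstract), and details such as bias correction and the extra 1/sqrt(t) scaling used in the original AdaBound paper are omitted.

```python
import numpy as np

def adabound_step(theta, grad, m, v, t,
                  alpha=1e-3, beta1=0.9, beta2=0.999, eps=1e-8,
                  eta_l=lambda t: 0.01,   # lower bound function (placeholder: constant)
                  eta_u=lambda t: 0.1):   # upper bound function (placeholder: constant)
    """One illustrative AdaBound-style update (hypothetical sketch, not the paper's code)."""
    m = beta1 * m + (1 - beta1) * grad            # first moment; beta1 may stay constant
    v = beta2 * v + (1 - beta2) * grad ** 2       # second moment, as in Adam
    step = alpha / (np.sqrt(v) + eps)             # Adam-style per-coordinate step size
    step = np.clip(step, eta_l(t), eta_u(t))      # clip into [eta_l(t), eta_u(t)]
    theta = theta - step * m                      # momentum-based parameter update
    return theta, m, v

# Usage sketch on a toy quadratic objective f(theta) = 0.5 * ||theta||^2
theta = np.ones(5)
m = np.zeros(5)
v = np.zeros(5)
for t in range(1, 1001):
    grad = theta                                  # gradient of the toy objective
    theta, m, v = adabound_step(theta, grad, m, v, t)
```

In the original AdaBound the bound functions tighten toward a final learning rate over time; the point of the sketch is only to show where they enter the update, which is the step the relaxed analysis above concerns.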
