
Efficient gradient computation for optimization of hyperparameters.

Abstract

We are interested in learning the hyperparameters of a convex objective function in a supervised setting. The complex relationship between the input data to the convex problem and the desirable hyperparameters can be modeled by a neural network; the hyperparameters and the data then drive the convex minimization problem, whose solution is compared to training labels. In our previous work [1], we evaluated a prototype of this learning strategy in an optimization-based sinogram smoothing plus FBP reconstruction framework. A question arising in this setting is how to efficiently compute (backpropagate) the gradient from the solution of the optimization problem back to the hyperparameters, so as to enable end-to-end training. In this work, we first develop general formulas for gradient backpropagation for a subset of convex problems, namely the proximal mapping. To illustrate the value of the general formulas and to demonstrate how to use them, we consider the specific instance of 1-D quadratic smoothing (denoising), whose solution admits a dynamic programming (DP) algorithm. The general formulas lead to another DP algorithm for exact computation of the gradient with respect to the hyperparameters. Our numerical studies demonstrate a 55%-65% computation time savings by providing a custom gradient instead of relying on automatic differentiation in deep learning libraries. While our discussion focuses on 1-D quadratic smoothing, our initial results (not presented) support the claim that the general formulas and the computational strategy apply equally well to TV or Huber smoothing problems on simple graphs whose solutions can be computed exactly via DP.

© 2021 Institute of Physics and Engineering in Medicine.
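To make the custom-gradient idea concrete, the following is a minimal PyTorch sketch, not the authors' algorithm. It wraps the 1-D quadratic smoothing proximal map x*(y, lam) = argmin_x 0.5*||x - y||^2 + 0.5*lam*||D x||^2 (D the first-difference operator, lam the hyperparameter) in a torch.autograd.Function and supplies an analytic backward pass via implicit differentiation. The class and variable names are invented for illustration, and a dense linear solve stands in for the O(n) dynamic-programming solver referred to in the abstract.

import torch

class QuadSmooth1D(torch.autograd.Function):
    """Proximal map of 1-D quadratic smoothing:
    x* = argmin_x 0.5*||x - y||^2 + 0.5*lam*||D x||^2
       = (I + lam * D^T D)^{-1} y,
    where D is the first-difference operator."""

    @staticmethod
    def forward(ctx, y, lam):
        n = y.shape[-1]
        # Dense system matrix for clarity; a DP/Thomas-algorithm solver would
        # exploit the tridiagonal structure of A and run in O(n).
        D = torch.diff(torch.eye(n, dtype=y.dtype), dim=0)
        A = torch.eye(n, dtype=y.dtype) + lam * (D.T @ D)
        x = torch.linalg.solve(A, y)
        ctx.save_for_backward(x, A, D)
        return x

    @staticmethod
    def backward(ctx, grad_out):
        x, A, D = ctx.saved_tensors
        # Implicit differentiation of A(lam) x = y:
        #   dx/dy   = A^{-1}              (A is symmetric)
        #   dx/dlam = -A^{-1} (D^T D) x
        w = torch.linalg.solve(A, grad_out)      # w = A^{-1} grad_out
        grad_y = w
        grad_lam = -(w @ (D.T @ (D @ x)))        # scalar gradient for lam
        return grad_y, grad_lam

# Example: gradient of a loss with respect to the smoothing weight lam.
y = torch.randn(64, dtype=torch.float64)
lam = torch.tensor(0.5, dtype=torch.float64, requires_grad=True)
x = QuadSmooth1D.apply(y, lam)
loss = x.square().sum()
loss.backward()   # lam.grad now holds dloss/dlam without autodiff through the solver

In this sketch the backward pass costs one extra linear solve with the same (tridiagonal) system matrix as the forward pass, which is, broadly, why a hand-derived gradient can be cheaper than letting automatic differentiation trace through the solver step by step.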
