A stochastic gradient descent method from the machine learning family (Adam), originally introduced in https://arxiv.org/abs/1412.6980, with:

Parameters

  • α is the stepsize
  • β₁, β₂ ∈ [0, 1) are the exponential decay rates for the moment estimates
  • θ is the parameter vector to be optimized
  • f(θ) is the stochastic objective function

Algorithm
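The update rule from the referenced paper can be sketched as follows. This is a minimal illustration, not a reference implementation: the function name `adam`, the argument `grad` (a callable returning the gradient of f at θ), the small constant `eps`, and the step-size tolerance `tol` used as the stopping test are assumptions made for this example. The default values of `alpha`, `beta1`, `beta2`, and `eps` follow the settings suggested in the paper.

```python
import math


def adam(grad, theta0, alpha=0.001, beta1=0.9, beta2=0.999, eps=1e-8,
         tol=1e-8, max_steps=100_000):
    """Sketch of the Adam update loop on a list of floats.

    grad      -- callable returning the (possibly stochastic) gradient at theta
    tol       -- assumed stopping rule: stop when the largest update is below tol
    max_steps -- assumed safety cap on the number of iterations
    """
    theta = list(theta0)
    m = [0.0] * len(theta)  # first moment estimate (mean of gradients)
    v = [0.0] * len(theta)  # second raw moment estimate (mean of squared gradients)
    for t in range(1, max_steps + 1):
        g = grad(theta)
        # Exponential moving averages of the gradient and its square.
        m = [beta1 * mi + (1 - beta1) * gi for mi, gi in zip(m, g)]
        v = [beta2 * vi + (1 - beta2) * gi * gi for vi, gi in zip(v, g)]
        # Bias correction for the zero initialization of m and v.
        m_hat = [mi / (1 - beta1 ** t) for mi in m]
        v_hat = [vi / (1 - beta2 ** t) for vi in v]
        # Parameter update, scaled per coordinate.
        step = [alpha * mh / (math.sqrt(vh) + eps)
                for mh, vh in zip(m_hat, v_hat)]
        theta = [th - s for th, s in zip(theta, step)]
        # Assumed convergence test: the parameters have effectively stopped moving.
        if max(abs(s) for s in step) < tol:
            break
    return theta, t


# Example: minimize f(theta) = (theta[0] - 3)^2 + (theta[1] + 1)^2.
theta, steps = adam(lambda th: [2 * (th[0] - 3), 2 * (th[1] + 1)],
                    [0.0, 0.0], alpha=0.05)
```

The example uses a deterministic gradient for simplicity. With a truly stochastic objective, `grad` would return a noisy estimate (for instance from a minibatch), and a stopping rule based on a single step is likely too strict, so a cap on iterations or a test on averaged progress may be a better fit.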

Termination condition

θ_t (the parameter vector at step t) has converged