Adam is a stochastic gradient descent method used in machine learning, originally introduced in https://arxiv.org/abs/1412.6980.
Parameters
- $\alpha$ is the step size.
- $\beta_1, \beta_2 \in [0, 1)$ are the exponential decay rates for the moment estimates.
- $\theta$ is the parameter vector to be optimized.
- $f(\theta)$ is the stochastic objective function.
Algorithm
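The update steps can be sketched as follows, using the notation of the cited paper: $\alpha$ is the step size, $\beta_1, \beta_2$ are the decay rates, $\theta$ is the parameter vector and $f$ is the objective. The symbols $g_t$, $m_t$, $v_t$ and the small constant $\epsilon$ (which guards against division by zero) are not defined in the parameter list above and are taken from the paper, which suggests $\alpha = 0.001$, $\beta_1 = 0.9$, $\beta_2 = 0.999$ and $\epsilon = 10^{-8}$ as defaults. Start from $m_0 = 0$, $v_0 = 0$, $t = 0$ and repeat for $t = 1, 2, \dots$ until the termination condition holds:

```latex
\begin{align}
g_t &= \nabla_\theta f_t(\theta_{t-1}) \\
m_t &= \beta_1 m_{t-1} + (1 - \beta_1)\, g_t \\
v_t &= \beta_2 v_{t-1} + (1 - \beta_2)\, g_t^2 \\
\hat{m}_t &= \frac{m_t}{1 - \beta_1^t} \\
\hat{v}_t &= \frac{v_t}{1 - \beta_2^t} \\
\theta_t &= \theta_{t-1} - \alpha\, \frac{\hat{m}_t}{\sqrt{\hat{v}_t} + \epsilon}
\end{align}
```

Here $g_t^2$ is the elementwise square, and the division and square root in the last line are also elementwise. The two divisions by $1 - \beta^t$ correct the bias that comes from initializing $m_0$ and $v_0$ at zero.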
Termination condition
$\theta_t$ has converged
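A minimal sketch of the method in plain Python, under a few assumptions that are not part of the description above: the function name `adam`, the constant `eps`, and the convergence test (stop once the largest parameter change in a step falls below a hypothetical tolerance `tol`, or after `max_steps` iterations) are illustrative choices. The paper itself leaves the convergence test open.

```python
import math


def adam(grad, theta0, alpha=0.001, beta1=0.9, beta2=0.999,
         eps=1e-8, tol=1e-6, max_steps=10_000):
    """Minimise an objective given a function `grad` returning its gradient.

    `tol` and `max_steps` are assumptions made for this sketch; they stand in
    for the unspecified "has converged" test.
    """
    theta = list(theta0)
    m = [0.0] * len(theta)  # first-moment (mean) estimate
    v = [0.0] * len(theta)  # second-moment (uncentred variance) estimate
    t = 0
    for t in range(1, max_steps + 1):
        g = grad(theta)
        largest_step = 0.0
        for i, g_i in enumerate(g):
            m[i] = beta1 * m[i] + (1 - beta1) * g_i
            v[i] = beta2 * v[i] + (1 - beta2) * g_i * g_i
            m_hat = m[i] / (1 - beta1 ** t)  # bias-corrected first moment
            v_hat = v[i] / (1 - beta2 ** t)  # bias-corrected second moment
            step = alpha * m_hat / (math.sqrt(v_hat) + eps)
            theta[i] -= step
            largest_step = max(largest_step, abs(step))
        if largest_step < tol:  # assumed convergence test
            break
    return theta, t


# Usage: minimise f(theta) = (theta - 3)^2, whose gradient is 2 * (theta - 3).
theta, steps = adam(lambda th: [2.0 * (th[0] - 3.0)], [0.0], alpha=0.05)
```

Because of the bias correction, the very first update has magnitude close to `alpha` in the direction opposite the gradient, regardless of the gradient's scale. A tolerance on the step size is only one possible stopping rule; a bound on the gradient norm or a fixed iteration budget are common alternatives.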