
mxnet.optimizer

Weight updating functions.

Optimization methods

AdaDelta([rho, epsilon]) The AdaDelta optimizer.
AdaGrad([eps]) AdaGrad optimizer.
Adam([learning_rate, beta1, beta2, epsilon, …]) The Adam optimizer.
Adamax([learning_rate, beta1, beta2]) The AdaMax optimizer.
DCASGD([momentum, lamda]) The DCASGD optimizer.
FTML([beta1, beta2, epsilon]) The FTML optimizer.
Ftrl([lamda1, learning_rate, beta]) The Ftrl optimizer.
LBSGD([momentum, multi_precision, …]) The Large Batch SGD optimizer with momentum and weight decay.
NAG([momentum]) Nesterov accelerated SGD.
Nadam([learning_rate, beta1, beta2, …]) The Nesterov Adam optimizer.
Optimizer([rescale_grad, param_idx2name, …]) The base class inherited by all optimizers.
RMSProp([learning_rate, gamma1, gamma2, …]) The RMSProp optimizer.
SGD([momentum, lazy_update]) The SGD optimizer with momentum and weight decay.
SGLD(**kwargs) Stochastic Gradient Riemannian Langevin Dynamics.
Signum([learning_rate, momentum, wd_lh]) The Signum optimizer that takes the sign of gradient or momentum.
Test(**kwargs) The Test optimizer.
Updater(optimizer) Updater for kvstore.
ccSGD(*args, **kwargs) [DEPRECATED] Same as SGD.
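
For concreteness, a minimal sketch (assuming MXNet 1.x and its NDArray API) of constructing one of the optimizers above and applying a single in-place update to a toy parameter; the parameter index and the toy weight and gradient values are illustrative only.

import mxnet as mx

# SGD with momentum and weight decay, using the hyperparameter names listed
# above; any of the other optimizers can be constructed the same way.
opt = mx.optimizer.SGD(learning_rate=0.1, momentum=0.9, wd=1e-4)

weight = mx.nd.ones((3,))              # toy parameter
grad = mx.nd.full((3,), 0.5)           # toy gradient for that parameter
state = opt.create_state(0, weight)    # per-parameter state (momentum buffer)

opt.update(0, weight, grad, state)     # updates `weight` in place
print(weight.asnumpy())

In practice, optimizers are rarely called this way directly; they are usually passed (by name or instance) to higher-level training interfaces, which manage the per-parameter state themselves.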

Helper functions

create(name, **kwargs) Instantiates an optimizer with a given name and kwargs.
get_updater(optimizer) Returns a closure of the updater needed for kvstore.
register(klass) Registers a new optimizer.
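
A minimal sketch (assuming MXNet 1.x) of the three helpers: create looks an optimizer up by its registered name, get_updater wraps it in the (index, grad, weight) closure used by kvstore, and register adds a new Optimizer subclass to that registry. MyOpt is a hypothetical optimizer written only to illustrate registration.

import mxnet as mx
from mxnet import optimizer as opt

# create(): instantiate an optimizer by its registered (lower-case) name.
adam = opt.create('adam', learning_rate=1e-3)

# get_updater(): a closure with signature (index, grad, weight) that applies
# the optimizer's update rule, keeping per-index state internally.
updater = opt.get_updater(adam)
weight = mx.nd.ones((2,))
grad = mx.nd.full((2,), 0.1)
updater(0, grad, weight)               # updates `weight` in place

# register(): make a custom optimizer available to create() under its
# lower-cased class name. MyOpt is a hypothetical plain-SGD-style rule.
@opt.register
class MyOpt(opt.Optimizer):
    def create_state(self, index, weight):
        return None                    # this rule keeps no state

    def update(self, index, weight, grad, state):
        self._update_count(index)
        lr = self._get_lr(index)
        wd = self._get_wd(index)
        weight[:] = weight - lr * (grad * self.rescale_grad + wd * weight)

my_opt = opt.create('myopt', learning_rate=0.05)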