Table Of Contents

Optimization

Initialize and update model weights during training

Optimizers

mx.opt.adadelta

Create an AdaDelta optimizer with the respective parameters

mx.opt.adagrad

Create an AdaGrad optimizer with the respective parameters. AdaGrad optimizer as described in Duchi et al., 2011

mx.opt.adam

Create an Adam optimizer with the respective parameters. Adam optimizer as described in Kingma and Ba, 2014

mx.opt.create

Create an optimizer by name and parameters

mx.opt.get.updater

Get an updater closure that takes a list of weights and gradients and returns an updated list of weights

mx.opt.rmsprop

Create an RMSProp optimizer with the respective parameters. Reference: Tieleman, T. and Hinton, G., Lecture 6.5: Divide the gradient by a running average of its recent magnitude. COURSERA: Neural Networks for Machine Learning, 2012, 4(2). The code follows Eq. (38)-(45) of Alex Graves, 2013, http://arxiv.org/pdf/1308.0850v5.pdf

mx.opt.sgd

Create an SGD optimizer with the respective parameters. Performs SGD with momentum updates. A minimal usage sketch for this group of functions follows below.
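
As a quick illustration of how these pieces fit together, the sketch below creates an SGD optimizer by name and applies its updater closure to a toy weight/gradient pair. This is only a sketch: the argument names learning.rate, momentum, and wd follow the documented defaults but should be checked against your MXNet version, and the ctx argument to mx.opt.get.updater is assumed here (older releases may take only the optimizer and weights).

```r
library(mxnet)

# Create an SGD optimizer by name; learning.rate/momentum/wd are the
# commonly documented arguments (see ?mx.opt.sgd).
opt <- mx.opt.create("sgd", learning.rate = 0.01, momentum = 0.9, wd = 1e-4)

# Toy named lists of weights and gradients (one NDArray per parameter).
weights <- list(fc1_weight = mx.nd.array(matrix(rnorm(4), 2, 2)))
grads   <- list(fc1_weight = mx.nd.array(matrix(0.1, 2, 2)))

# The updater is a closure mapping (weights, gradients) to updated weights.
# The ctx argument is assumed; older versions may omit it.
updater <- mx.opt.get.updater(opt, weights, ctx = mx.cpu())
weights <- updater(weights, grads)
```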

Initialization

mx.init.create

Create initialization for arguments such as arg.array

mx.init.internal.default

Internal default value initialization scheme

mx.init.normal

Create an initializer that initializes weights with Normal(0, sd)

mx.init.uniform

Create an initializer that initializes weights with Uniform[-scale, scale]

mx.init.Xavier

Xavier initializer

mx.model.init.params

Parameter initialization
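
For orientation, here is a minimal sketch of how an initializer from this group is typically constructed and handed to training. The mx.init.Xavier argument names (rnd_type, factor_type, magnitude) and mx.init.uniform(scale) follow the reference entries above but should be verified against the help pages; the mx.model.FeedForward.create call is indicated only schematically.

```r
library(mxnet)

# A Xavier initializer; argument names are the commonly documented ones.
init <- mx.init.Xavier(rnd_type = "gaussian", factor_type = "avg", magnitude = 2)

# A simple alternative: uniform initialization in [-0.07, 0.07].
init.u <- mx.init.uniform(0.07)

# Initializers are normally passed to model construction rather than
# called directly, e.g. (schematic, other arguments elided):
# model <- mx.model.FeedForward.create(symbol, X = train.x, y = train.y,
#                                      initializer = init, ...)
```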

Learning rate schedule

mx.lr_scheduler.FactorScheduler

Learning rate scheduler that reduces the learning rate by a fixed factor at a set step interval

mx.lr_scheduler.MultiFactorScheduler

Multi-factor learning rate scheduler that reduces the learning rate by a factor value at specified steps. A usage sketch for schedulers follows below.
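
The sketch below wires a factor-based schedule into an optimizer. The argument names step and factor_val are taken from the R reference but may differ across versions, so treat this as an assumption to verify.

```r
library(mxnet)

# Multiply the learning rate by 0.5 every 1000 updates.
# Argument names (step, factor_val) are assumed from the R reference.
sched <- mx.lr_scheduler.FactorScheduler(step = 1000, factor_val = 0.5)

# Attach the schedule to an optimizer through lr_scheduler.
opt <- mx.opt.create("sgd", learning.rate = 0.1, momentum = 0.9,
                     lr_scheduler = sched)
```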

Optimizer updates (NDArray)

mx.nd.adam.update

Update function for Adam optimizer. Adam is seen as a generalization of AdaGrad

mx.nd.ftml.update

The FTML optimizer described in FTML - Follow the Moving Leader in Deep Learning, available at http://proceedings.mlr.press/v70/zheng17a/zheng17a.pdf

mx.nd.ftrl.update

Update function for the FTRL optimizer. Reference: Ad Click Prediction: a View from the Trenches, available at http://dl.acm.org/citation.cfm?id=2488200

mx.nd.mp.sgd.mom.update

Update function for the multi-precision SGD optimizer with momentum

mx.nd.mp.sgd.update

Update function for the multi-precision SGD optimizer

mx.nd.rmspropalex.update

Update function for RMSPropAlex optimizer

mx.nd.rmsprop.update

Update function for RMSProp optimizer

mx.nd.sgd.mom.update

Momentum update function for Stochastic Gradient Descent (SGD) optimizer

mx.nd.sgd.update

Update function for Stochastic Gradient Descent (SGD) optimizer

mx.nd.signsgd.update

Update function for SignSGD optimizer

mx.nd.signum.update

Update function for the Signum (SIGN momentUM) optimizer. A usage sketch for the NDArray update operators follows below.
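
These operators apply one fused optimizer step directly on NDArrays, which is occasionally useful outside the high-level training loop. Below is a minimal sketch using the momentum SGD update; the argument names (weight, grad, mom, lr, momentum, wd) are assumed from the operator documentation and should be checked against ?mx.nd.sgd.mom.update for your MXNet version.

```r
library(mxnet)

# One fused SGD-with-momentum step applied directly to NDArrays.
w   <- mx.nd.array(matrix(rnorm(4), 2, 2))
g   <- mx.nd.array(matrix(0.1, 2, 2))     # gradient
mom <- mx.nd.zeros(c(2, 2))               # momentum state buffer

# Argument names are assumed from the operator documentation.
w.new <- mx.nd.sgd.mom.update(weight = w, grad = g, mom = mom,
                              lr = 0.01, momentum = 0.9, wd = 1e-4)
```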

Optimizer updates (Symbol)

mx.symbol.adam_update

Update function for Adam optimizer. Adam is seen as a generalization of AdaGrad

mx.symbol.ftml_update

The FTML optimizer described in FTML - Follow the Moving Leader in Deep Learning, available at http://proceedings.mlr.press/v70/zheng17a/zheng17a.pdf

mx.symbol.ftrl_update

Update function for the FTRL optimizer. Reference: Ad Click Prediction: a View from the Trenches, available at http://dl.acm.org/citation.cfm?id=2488200

mx.symbol.mp_sgd_mom_update

Update function for the multi-precision SGD optimizer with momentum

mx.symbol.mp_sgd_update

Update function for the multi-precision SGD optimizer

mx.symbol.sgd_mom_update

Momentum update function for Stochastic Gradient Descent (SGD) optimizer

mx.symbol.sgd_update

Update function for Stochastic Gradient Descent (SGD) optimizer

mx.symbol.signsgd_update

Update function for SignSGD optimizer

mx.symbol.signum_update

Update function for the Signum (SIGN momentUM) optimizer. A usage sketch for the symbolic update operators follows below.
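
The symbolic variants build the same update steps into a computation graph instead of executing them eagerly. The sketch below is illustrative only; the argument names (weight, grad, lr, wd) are assumed to mirror the NDArray operators and should be verified against ?mx.symbol.sgd_update.

```r
library(mxnet)

# Symbolic version of the plain SGD update: placeholders for weight and
# gradient, with scalar hyperparameters fixed at graph-construction time.
w <- mx.symbol.Variable("weight")
g <- mx.symbol.Variable("grad")

# Argument names are assumed to mirror mx.nd.sgd.update.
upd <- mx.symbol.sgd_update(weight = w, grad = g, lr = 0.01, wd = 1e-4)

# The resulting symbol can be inspected or bound like any other symbol.
arguments(upd)
```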