# NAG

class mxnet.optimizer.NAG(momentum=0.0, **kwargs)

Nesterov accelerated SGD.

This optimizer updates each weight by:

state = momentum * state + grad + wd * weight
weight = weight - (lr * (grad + momentum * state))
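As a rough illustration, the two update lines above can be transcribed into NumPy. The function `nag_step` and its argument names are hypothetical, chosen only for this example; this is a sketch of the documented formula, not MXNet's internal implementation.

```python
import numpy as np

def nag_step(weight, grad, state, lr, momentum=0.0, wd=0.0):
    """Apply one update as written in the formula above.

    Hypothetical helper for illustration; returns (new_weight, new_state).
    """
    # state = momentum * state + grad + wd * weight
    state = momentum * state + grad + wd * weight
    # weight = weight - (lr * (grad + momentum * state))
    weight = weight - lr * (grad + momentum * state)
    return weight, state

# One step from a zero state, with made-up values.
w = np.array([1.0, -2.0])
g = np.array([0.5, 0.5])
s = np.zeros_like(w)
w, s = nag_step(w, g, s, lr=0.1, momentum=0.9, wd=0.0)
print(w, s)
```

With a zero initial state and `wd=0.0`, the new state equals the gradient, so each weight moves by `lr * (1 + momentum) * grad` on the first step.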

Parameters:

- momentum (float, optional) – The momentum value.
- multi_precision (bool, optional) – Flag to control the internal precision of the optimizer. False (the default) uses the same precision as the weights. True makes the optimizer keep an internal 32-bit copy of the weights and apply gradients in 32-bit precision, even if the weights used in the model have lower precision. Turning this on can improve convergence and accuracy when training with float16.
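The reason a 32-bit copy can help with float16 weights can be seen in a small NumPy sketch. It is a toy under stated assumptions, not MXNet's code: it uses a plain gradient step without momentum, and all variable names are hypothetical.

```python
import numpy as np

lr = 1.0
grad16 = np.array([1e-4], dtype=np.float16)   # small gradient, made-up value

# Low precision only: the update is applied directly to float16 weights.
w16_direct = np.array([1.0], dtype=np.float16)

# Mixed precision sketch: keep a float32 master copy and update that instead.
w32_master = np.array([1.0], dtype=np.float32)

for _ in range(100):
    # Just below 1.0, float16 values are about 4.9e-4 apart, so a 1e-4 step
    # rounds back to 1.0 every time.
    w16_direct = (w16_direct - lr * grad16).astype(np.float16)
    # The float32 copy has enough resolution to accumulate the small steps.
    w32_master = w32_master - lr * grad16.astype(np.float32)

# Cast the master copy back to the model's precision.
w16_from_master = w32_master.astype(np.float16)
print(w16_direct, w16_from_master)
```

In this toy run the directly updated float16 weight never leaves 1.0, while the weight derived from the float32 copy ends near 0.99.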
__init__(momentum=0.0, **kwargs)

