
mx.opt.rmsprop

Description

Create an RMSProp optimizer with the given parameters. Reference: Tieleman T., Hinton G. Lecture 6.5-rmsprop: Divide the gradient by a running average of its recent magnitude. COURSERA: Neural Networks for Machine Learning, 2012, 4(2). The code follows Eq. (38)-(45) of http://arxiv.org/pdf/1308.0850v5.pdf (Alex Graves, 2013).
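With centered = TRUE this follows the centered formulation of Graves (2013). As a sketch in that paper's notation, writing the function's gamma1, gamma2, and epsilon as \gamma_1, \gamma_2, \epsilon and the (possibly scheduled) learning rate as \eta, one update of weight w_i with gradient g_i is:

\begin{aligned}
n_i &\leftarrow \gamma_1 n_i + (1 - \gamma_1) g_i^2 \\
\bar{g}_i &\leftarrow \gamma_1 \bar{g}_i + (1 - \gamma_1) g_i \\
\Delta_i &\leftarrow \gamma_2 \Delta_i - \eta\, g_i \big/ \sqrt{n_i - \bar{g}_i^2 + \epsilon} \\
w_i &\leftarrow w_i + \Delta_i
\end{aligned}

How wd, rescale.grad, and clip_gradient enter g_i is illustrated in the sketch after the argument table below.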

Usage

mx.opt.rmsprop(learning.rate = 0.002, centered = TRUE, gamma1 = 0.95,
  gamma2 = 0.9, epsilon = 1e-04, wd = 0, rescale.grad = 1,
  clip_gradient = -1, lr_scheduler = NULL)
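
For example (a minimal sketch, assuming the mxnet package is attached), construct an optimizer with the defaults above but with gradient clipping enabled; the returned object can then be passed wherever an optimizer is expected:

library(mxnet)

# RMSProp with the default rates and gradients clipped to [-1, 1]
optimizer <- mx.opt.rmsprop(
  learning.rate = 0.002,
  gamma1 = 0.95,        # decay of the moving averages
  gamma2 = 0.9,         # "momentum" factor
  clip_gradient = 1     # clip each gradient element to [-1, 1]
)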

Arguments

learning.rate
  float, default=0.002. The initial learning rate.

centered
  logical, default=TRUE. Use the centered variant, which also tracks a moving average of the gradient itself (Graves, 2013).

gamma1
  float, default=0.95. Decay factor of the moving averages of the gradient and of the squared gradient.

gamma2
  float, default=0.9. "Momentum" factor.

epsilon
  float, default=1e-4. Small constant for numerical stability.

wd
  float, default=0.0. L2 regularization coefficient added to all the weights.

rescale.grad
  float, default=1.0. Rescaling factor applied to the gradient.

clip_gradient
  float, optional, default=-1. Clip the gradient to the range [-clip_gradient, clip_gradient]; no clipping if clip_gradient < 0.

lr_scheduler
  function, optional. The learning rate scheduler.
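
To make the interplay of these arguments concrete, the following is an illustrative sketch of one centered RMSProp step for a numeric weight vector. It is not mxnet's internal implementation, and the exact order in which the library applies rescaling, clipping, and weight decay may differ:

# Illustrative only: one centered RMSProp step following
# Graves (2013), Eq. (38)-(45). `state` holds the running averages
# n (of g^2) and gbar (of g), plus the previous update delta.
rmsprop.step <- function(w, grad, state, lr = 0.002, gamma1 = 0.95,
                         gamma2 = 0.9, epsilon = 1e-4, wd = 0,
                         rescale.grad = 1, clip_gradient = -1) {
  g <- grad * rescale.grad
  if (clip_gradient >= 0) {                       # no clipping if < 0
    g <- pmax(pmin(g, clip_gradient), -clip_gradient)
  }
  g <- g + wd * w                                 # L2 regularization
  state$n     <- gamma1 * state$n + (1 - gamma1) * g^2
  state$gbar  <- gamma1 * state$gbar + (1 - gamma1) * g
  state$delta <- gamma2 * state$delta -
    lr * g / sqrt(state$n - state$gbar^2 + epsilon)
  list(w = w + state$delta, state = state)
}

# One step on a toy two-element weight vector:
state <- list(n = 0, gbar = 0, delta = 0)
res <- rmsprop.step(w = c(0.5, -0.3), grad = c(0.1, -0.2), state = state)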