
RNN

class mxnet.gluon.rnn.RNN(hidden_size, num_layers=1, activation='relu', layout='TNC', dropout=0, bidirectional=False, i2h_weight_initializer=None, h2h_weight_initializer=None, i2h_bias_initializer='zeros', h2h_bias_initializer='zeros', input_size=0, **kwargs)[source]

Applies a multi-layer Elman RNN with tanh or ReLU non-linearity to an input sequence.

For each element in the input sequence, each layer computes the following function:

\[h_t = \tanh(w_{ih} * x_t + b_{ih} + w_{hh} * h_{(t-1)} + b_{hh})\]

where \(h_t\) is the hidden state at time t, and \(x_t\) is the output of the previous layer at time t, or \(input_t\) for the first layer. If activation='relu', then ReLU is used instead of tanh.
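This recurrence can be spelled out with plain NDArray operations. The snippet below is a minimal sketch of a single time step for one layer; the names (x_t, h_prev, w_ih, and so on) mirror the formula above and are not part of the RNN API, and the sizes are arbitrary.

>>> import mxnet as mx
>>> batch, input_size, hidden_size = 3, 10, 100
>>> x_t = mx.nd.random.uniform(shape=(batch, input_size))
>>> h_prev = mx.nd.zeros((batch, hidden_size))
>>> w_ih = mx.nd.random.uniform(shape=(hidden_size, input_size))
>>> w_hh = mx.nd.random.uniform(shape=(hidden_size, hidden_size))
>>> b_ih = mx.nd.zeros(hidden_size)
>>> b_hh = mx.nd.zeros(hidden_size)
>>> # one step of the Elman recurrence with tanh activation
>>> h_t = mx.nd.tanh(mx.nd.dot(x_t, w_ih.T) + b_ih
...                  + mx.nd.dot(h_prev, w_hh.T) + b_hh)
>>> h_t.shape
(3, 100)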

Parameters:
  • hidden_size (int) – The number of features in the hidden state h.
  • num_layers (int, default 1) – Number of recurrent layers.
  • activation ({'relu' or 'tanh'}, default 'relu') – The activation function to use.
  • layout (str, default 'TNC') – The format of input and output tensors. T, N and C stand for sequence length, batch size, and feature dimensions respectively.
  • dropout (float, default 0) – If non-zero, introduces a dropout layer on the outputs of each RNN layer except the last layer.
  • bidirectional (bool, default False) – If True, becomes a bidirectional RNN.
  • i2h_weight_initializer (str or Initializer) – Initializer for the input weights matrix, used for the linear transformation of the inputs.
  • h2h_weight_initializer (str or Initializer) – Initializer for the recurrent weights matrix, used for the linear transformation of the recurrent state.
  • i2h_bias_initializer (str or Initializer) – Initializer for the input-to-hidden bias vector.
  • h2h_bias_initializer (str or Initializer) – Initializer for the hidden-to-hidden bias vector.
  • input_size (int, default 0) – The number of expected features in the input x. If not specified, it will be inferred from the first input.
  • prefix (str or None) – Prefix of this Block.
  • params (ParameterDict or None) – Shared Parameters for this Block.
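For illustration, the constructor arguments above might be combined as follows. The configuration here (two stacked bidirectional tanh layers, batch-major layout, dropout between layers) is a hypothetical example, not a recommended setting.

>>> layer = mx.gluon.rnn.RNN(hidden_size=100, num_layers=2,
...                          activation='tanh', layout='NTC',
...                          dropout=0.2, bidirectional=True,
...                          input_size=10)
>>> layer.initialize()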
Inputs:
  • data: input tensor with shape (sequence_length, batch_size, input_size) when layout is “TNC”. For other layouts, dimensions are permuted accordingly using the transpose() operator, which adds performance overhead. Consider creating batches in the TNC layout during the data-batching step.
  • states: initial recurrent state tensor with shape (num_layers, batch_size, num_hidden). If bidirectional is True, shape will instead be (2*num_layers, batch_size, num_hidden). If states is None, zeros will be used as default begin states.
Outputs:
  • out: output tensor with shape (sequence_length, batch_size, num_hidden) when layout is “TNC”. If bidirectional is True, output shape will instead be (sequence_length, batch_size, 2*num_hidden).
  • out_states: output recurrent state tensor with the same shape as states. If states is None, out_states will not be returned.
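These shapes can be checked directly. A quick sketch, assuming a bidirectional two-layer RNN in the default “TNC” layout with arbitrary sizes:

>>> layer = mx.gluon.rnn.RNN(100, 2, bidirectional=True)
>>> layer.initialize()
>>> data = mx.nd.random.uniform(shape=(5, 3, 10))
>>> states = mx.nd.zeros((4, 3, 100))  # (2*num_layers, batch_size, num_hidden)
>>> out, out_states = layer(data, states)
>>> out.shape
(5, 3, 200)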

Examples

>>> layer = mx.gluon.rnn.RNN(100, 3)
>>> layer.initialize()
>>> input = mx.nd.random.uniform(shape=(5, 3, 10))
>>> # by default zeros are used as begin state
>>> output = layer(input)
>>> # manually specify begin state.
>>> h0 = mx.nd.random.uniform(shape=(3, 3, 100))
>>> output, hn = layer(input, h0)
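Since the layer is a HybridBlock (see the methods below), it can also be hybridized to run as a cached symbolic graph. A brief sketch continuing the example above:

>>> layer.hybridize()
>>> output, hn = layer(input, h0)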
__init__(hidden_size, num_layers=1, activation='relu', layout='TNC', dropout=0, bidirectional=False, i2h_weight_initializer=None, h2h_weight_initializer=None, i2h_bias_initializer='zeros', h2h_bias_initializer='zeros', input_size=0, **kwargs)[source]

Initialize self. See help(type(self)) for accurate signature.

Methods

__init__(hidden_size[, num_layers, …]) Initialize self.
apply(fn) Applies fn recursively to every child block as well as self.
begin_state([batch_size, func]) Initial state for this cell.
cast(dtype) Cast this Block to use another data type.
collect_params([select]) Returns a ParameterDict containing this Block's and all of its children's Parameters (default), or only those Parameters whose names match the given regular expressions.
export(path[, epoch]) Export HybridBlock to json format that can be loaded by SymbolBlock.imports, mxnet.mod.Module or the C++ interface.
forward(x, *args) Defines the forward computation.
hybrid_forward(F, inputs[, states]) Overrides to construct symbolic graph for this Block.
hybridize([active]) Activates or deactivates HybridBlocks recursively.
infer_shape(*args) Infers shape of Parameters from inputs.
infer_type(*args) Infers data type of Parameters from inputs.
initialize([init, ctx, verbose, force_reinit]) Initializes Parameters of this Block and its children.
load_parameters(filename[, ctx, …]) Load parameters from file previously saved by save_parameters.
load_params(filename[, ctx, allow_missing, …]) [Deprecated] Please use load_parameters.
name_scope() Returns a name space object managing a child Block and parameter names.
register_child(block[, name]) Registers block as a child of self.
register_forward_hook(hook) Registers a forward hook on the block.
register_forward_pre_hook(hook) Registers a forward pre-hook on the block.
save_parameters(filename) Save parameters to file.
save_params(filename) [Deprecated] Please use save_parameters.
state_info([batch_size]) Shape and layout information of the states.
summary(*inputs) Print the summary of the model’s output and parameters.
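A few of the methods above in use. This is a hedged sketch: the file name 'rnn.params' is arbitrary, and begin_state is used here only to build zero-valued initial states explicitly.

>>> layer = mx.gluon.rnn.RNN(100, 3)
>>> layer.initialize()
>>> states = layer.begin_state(batch_size=3)
>>> data = mx.nd.random.uniform(shape=(5, 3, 10))
>>> output, new_states = layer(data, states)
>>> layer.save_parameters('rnn.params')
>>> layer.load_parameters('rnn.params')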

Attributes

name Name of this Block, without the trailing ‘_’.
params Returns this Block’s parameter dictionary (does not include its children’s parameters).
prefix Prefix of this Block.