

mxnet.autograd.grad(heads, variables, head_grads=None, retain_graph=None, create_graph=False, train_mode=True)[source]

Compute the gradients of heads with respect to variables. Gradients will be returned as new NDArrays instead of stored in variable.grad. Supports recording the gradient graph for computing higher order gradients.


Note: Currently only a very limited set of operators support higher order gradients; a second-order sketch follows the example below.

Parameters

  • heads (NDArray or list of NDArray) – Output NDArray(s).

  • variables (NDArray or list of NDArray) – Input variables to compute gradients for.

  • head_grads (NDArray or list of NDArray or None) – Gradients with respect to heads; see the sketch after this parameter list.

  • retain_graph (bool) – Whether to keep the computation graph for differentiating again, instead of clearing the history and releasing memory. Defaults to the same value as create_graph.

  • create_graph (bool) – Whether to record the gradient graph for computing higher order gradients.

  • train_mode (bool, optional) – Whether to do backward for training or prediction.
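Because head_grads supplies the gradient flowing into heads, grad() effectively computes a vector-Jacobian product rather than assuming an all-ones head gradient. A minimal sketch, assuming `import mxnet as mx`; the weight vector `w` is purely illustrative and not part of this API:

>>> import mxnet as mx
>>> x = mx.nd.array([1.0, 2.0, 3.0])
>>> x.attach_grad()
>>> with mx.autograd.record():
...     z = x * x                       # dz_i/dx_i = 2 * x_i
>>> w = mx.nd.array([1.0, 0.5, 0.25])   # illustrative per-element weights
>>> dx = mx.autograd.grad(z, [x], head_grads=w)[0]
>>> print(dx)                           # expected 2 * x * w = [2. 2. 1.5]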


Returns

Gradients with respect to variables.

Return type

NDArray or list of NDArray


>>> x = mx.nd.ones((1,))
>>> x.attach_grad()
>>> with mx.autograd.record():
...     z = mx.nd.elemwise_add(mx.nd.exp(x), x)
>>> dx = mx.autograd.grad(z, [x], create_graph=True)
>>> print(dx)
[
[ 3.71828175]
<NDArray 1 @cpu(0)>]
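Since create_graph=True records the gradient computation itself, the returned gradient can be differentiated once more. A minimal second-order sketch, assuming mx.nd.exp is among the operators that support higher order gradients (see the note above); the grad() call is placed inside the record() scope so the first-order gradient is itself recorded:

>>> import mxnet as mx
>>> x = mx.nd.ones((1,))
>>> x.attach_grad()
>>> with mx.autograd.record():
...     y = mx.nd.exp(x)
...     # record dy/dx = exp(x) so it can be differentiated again
...     dy_dx = mx.autograd.grad(y, [x], create_graph=True, retain_graph=True)[0]
>>> dy_dx.backward()    # d2y/dx2 = exp(x)
>>> print(x.grad)       # expected ~2.7182817 at x = 1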