

mxnet.ndarray.Correlation(data1=None, data2=None, kernel_size=_Null, max_displacement=_Null, stride1=_Null, stride2=_Null, pad_size=_Null, is_multiply=_Null, out=None, name=None, **kwargs)

Applies correlation to inputs.

The correlation layer performs multiplicative patch comparisons between two feature maps.

Given two multi-channel feature maps \(f_{1}, f_{2}\), with \(w\), \(h\), and \(c\) being their width, height, and number of channels, the correlation layer lets the network compare each patch from \(f_{1}\) with each patch from \(f_{2}\).

For now we consider only a single comparison of two patches. The ‘correlation’ of two patches centered at \(x_{1}\) in the first map and \(x_{2}\) in the second map is then defined as:

\[c(x_{1}, x_{2}) = \sum_{o \in [-k,k] \times [-k,k]} \langle f_{1}(x_{1} + o), f_{2}(x_{2} + o) \rangle\]

for a square patch of size \(K:=2k+1\).

Note that the equation above is identical to one step of a convolution in neural networks, but instead of convolving data with a filter, it convolves data with other data. For this reason, it has no training weights.
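The patch comparison defined above can be sketched in plain Python. This is a minimal illustration and not the operator's actual kernel: the function name `patch_correlation` and the nested-list layout `f[channel][row][col]` are assumptions made for the example, and the patches are assumed to lie fully inside both maps.

```python
def patch_correlation(f1, f2, x1, x2, k):
    """Correlation c(x1, x2) of two (2k+1) x (2k+1) patches.

    f1, f2: feature maps as nested lists, indexed f[channel][row][col]
            (hypothetical layout chosen for this sketch).
    x1, x2: (row, col) patch centers; both patches must fit inside the maps.
    """
    total = 0.0
    for dy in range(-k, k + 1):
        for dx in range(-k, k + 1):
            # Inner product over channels at the offset o = (dy, dx).
            for ch1, ch2 in zip(f1, f2):
                total += ch1[x1[0] + dy][x1[1] + dx] * ch2[x2[0] + dy][x2[1] + dx]
    return total


# Two single-channel 3x3 maps compared at their centers with k = 1 (K = 3).
f1 = [[[1, 2, 3], [4, 5, 6], [7, 8, 9]]]
f2 = [[[1, 0, 0], [0, 1, 0], [0, 0, 1]]]
result = patch_correlation(f1, f2, (1, 1), (1, 1), k=1)
```

Because `f2` is an identity pattern here, only the diagonal entries of `f1` contribute to the sum. No learned weights appear anywhere in the computation.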

Computing \(c(x_{1}, x_{2})\) involves \(c \cdot K^{2}\) multiplications, where \(c\) is the number of channels. Comparing all patch combinations involves \(w^{2} \cdot h^{2}\) such computations.
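To give a feel for these counts, the arithmetic can be worked through for one set of sizes. The helper name `correlation_cost` and the sizes below are purely illustrative assumptions, not values taken from the operator.

```python
def correlation_cost(w, h, c, k):
    """Multiplication counts for comparing every patch pair (illustrative helper)."""
    K = 2 * k + 1                  # patch side length
    mults_per_pair = c * K * K     # multiplications for one c(x1, x2)
    pairs = (w * h) ** 2           # every location in f1 against every location in f2
    return mults_per_pair, pairs, mults_per_pair * pairs


# Hypothetical feature map: 64 x 48 with 256 channels, single-pixel patches (k = 0).
per_pair, pairs, total = correlation_cost(w=64, h=48, c=256, k=0)
```

Even for this modest map size the total is in the billions of multiplications.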

Given a maximum displacement \(d\), the layer computes correlations \(c(x_{1}, x_{2})\) for each location \(x_{1}\) only in a neighborhood of size \(D:=2d+1\), by limiting the range of \(x_{2}\). Strides \(s_{1}\) and \(s_{2}\) are used to quantize \(x_{1}\) globally and to quantize \(x_{2}\) within the neighborhood centered around \(x_{1}\), respectively.
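The restricted neighborhood can be pictured as a grid of displacements applied to \(x_{1}\). The sketch below enumerates that grid. The helper name `displacement_offsets` is hypothetical, and the rule that the grid radius is \(d\) divided by \(s_{2}\) with integer division is an assumption about how the quantization is carried out, not something stated above.

```python
def displacement_offsets(max_displacement, stride2):
    """List the (dy, dx) displacements of x2 relative to x1 (illustrative helper)."""
    # Assumption: the neighborhood grid radius uses integer division.
    radius = max_displacement // stride2
    steps = range(-radius, radius + 1)
    return [(dy * stride2, dx * stride2) for dy in steps for dx in steps]


# d = 4 with s2 = 2 samples every second pixel inside the 9 x 9 neighborhood.
offsets = displacement_offsets(max_displacement=4, stride2=2)
```

With these values the 81 candidate positions of the full neighborhood shrink to a 5 x 5 grid of 25 displacements, each of which would become one output channel.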

The final output is defined by the following expression:

\[out[n, q, i, j] = c(x_{i, j}, x_{q})\]

where \(n\) indexes the sample in the batch, \(i\) and \(j\) enumerate spatial locations in \(f_{1}\), and \(q\) indexes the \(q^{th}\) position \(x_{q}\) in the neighborhood of \(x_{i,j}\).
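Putting the pieces together, the output expression can be sketched as a naive reference loop for a single sample, so the batch index \(n\) is dropped. This is a sketch under stated assumptions rather than the operator's implementation: the name `correlation_output` is hypothetical, no padding is applied, patch centers are kept far enough from the border that every patch fits, the displacement grid uses integer division of \(d\) by \(s_{2}\), and no normalization is applied beyond the formula given above.

```python
def correlation_output(f1, f2, k, d, s1, s2):
    """Naive out[q][i][j] for one sample; f1, f2 are indexed f[channel][row][col]."""
    height, width = len(f1[0]), len(f1[0][0])
    border = d + k                      # assumption: keep all patches inside the maps
    radius = d // s2                    # assumption: integer-division grid radius
    offsets = [(oy * s2, ox * s2)
               for oy in range(-radius, radius + 1)
               for ox in range(-radius, radius + 1)]
    out = []
    for oy, ox in offsets:              # q enumerates displacements
        plane = []
        for y in range(border, height - border, s1):      # i
            row = []
            for x in range(border, width - border, s1):   # j
                acc = 0.0
                for dy in range(-k, k + 1):
                    for dx in range(-k, k + 1):
                        for ch1, ch2 in zip(f1, f2):
                            acc += ch1[y + dy][x + dx] * ch2[y + oy + dy][x + ox + dx]
                row.append(acc)
            plane.append(row)
        out.append(plane)
    return out


# A single-channel 3x3 map correlated with itself: k = 0, d = 1, unit strides.
fmap = [[[1, 2, 3], [4, 5, 6], [7, 8, 9]]]
out = correlation_output(fmap, fmap, k=0, d=1, s1=1, s2=1)
```

Here only the center location is a valid \(x_{i,j}\), so each of the nine output channels holds a single value: the center of the first map multiplied by one of its nine neighbors in the second.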

Defined in src/operator/

Parameters

  • data1 (NDArray) – Input data1 to the correlation.

  • data2 (NDArray) – Input data2 to the correlation.

  • kernel_size (int (non-negative), optional, default=1) – Kernel size for Correlation; must be an odd number.

  • max_displacement (int (non-negative), optional, default=1) – Maximum displacement of Correlation.

  • stride1 (int (non-negative), optional, default=1) – stride1 quantizes data1 globally.

  • stride2 (int (non-negative), optional, default=1) – stride2 quantizes data2 within the neighborhood centered around data1.

  • pad_size (int (non-negative), optional, default=0) – Padding size for Correlation.

  • is_multiply (boolean, optional, default=1) – Operation type is either multiplication or subtraction.

  • out (NDArray, optional) – The output NDArray to hold the result.
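The parameters above jointly determine the shape of the result. The helper below is a hedged sketch of that relationship. The name `correlation_output_shape` is hypothetical, and the formulas (a border of `max_displacement` plus the kernel radius, ceiling division by `stride1`, and a displacement grid radius of `max_displacement // stride2`) are assumptions about the shape inference that this page does not state, so they should be checked against the operator before being relied on.

```python
import math


def correlation_output_shape(data_shape, kernel_size=1, max_displacement=1,
                             stride1=1, stride2=1, pad_size=0):
    """Estimate the (N, C, H, W) output shape (assumed formulas, see lead-in)."""
    if kernel_size % 2 == 0:
        raise ValueError("kernel_size must be an odd number")
    n, _, h, w = data_shape
    kernel_radius = (kernel_size - 1) // 2
    border = max_displacement + kernel_radius
    padded_h = h + 2 * pad_size
    padded_w = w + 2 * pad_size
    out_h = math.ceil((padded_h - 2 * border) / stride1)
    out_w = math.ceil((padded_w - 2 * border) / stride1)
    grid_width = 2 * (max_displacement // stride2) + 1
    return (n, grid_width * grid_width, out_h, out_w)


# Hypothetical input of shape (1, 256, 48, 64) with a 9 x 9 displacement grid.
shape = correlation_output_shape((1, 256, 48, 64), kernel_size=1,
                                 max_displacement=4, stride1=1, stride2=1,
                                 pad_size=4)
```

Under these assumptions, choosing `pad_size` equal to `max_displacement` with `kernel_size=1` keeps the spatial size of the input, and the channel dimension becomes the number of displacements.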


Returns

out – The output of this function.

Return type

NDArray or list of NDArrays