mxnet.ndarray.Correlation

mxnet.ndarray.Correlation(data1=None, data2=None, kernel_size=_Null, max_displacement=_Null, stride1=_Null, stride2=_Null, pad_size=_Null, is_multiply=_Null, out=None, name=None, **kwargs)

Applies correlation to inputs.

The correlation layer performs multiplicative patch comparisons between two feature maps.

Given two multi-channel feature maps $$f_{1}, f_{2}$$, with $$w$$, $$h$$, and $$c$$ being their width, height, and number of channels, the correlation layer lets the network compare each patch from $$f_{1}$$ with each patch from $$f_{2}$$.

For now we consider only a single comparison of two patches. The ‘correlation’ of two patches centered at $$x_{1}$$ in the first map and $$x_{2}$$ in the second map is then defined as:

$c(x_{1}, x_{2}) = \sum_{o \in [-k,k] \times [-k,k]} \langle f_{1}(x_{1} + o), f_{2}(x_{2} + o) \rangle$

for a square patch of size $$K:=2k+1$$.
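The definition above can be sketched in plain Python. This is only an illustration of the formula, not the operator's implementation: the helper name `patch_correlation`, the nested-list layout `[channel][row][col]`, and the toy data are all assumptions made for this example, and the patches are assumed to lie fully inside both maps.

```python
def patch_correlation(f1, f2, x1, x2, k):
    """Sum of channel-wise inner products over a (2k+1) x (2k+1) patch.

    f1, f2: feature maps as nested lists indexed [channel][row][col] (assumed layout).
    x1, x2: (row, col) patch centers; both patches must fit inside the maps.
    """
    channels = len(f1)
    (r1, c1), (r2, c2) = x1, x2
    total = 0.0
    for dy in range(-k, k + 1):
        for dx in range(-k, k + 1):
            # Inner product of the two channel vectors at offset o = (dy, dx).
            for ch in range(channels):
                total += f1[ch][r1 + dy][c1 + dx] * f2[ch][r2 + dy][c2 + dx]
    return total


# Hypothetical toy data: two 2-channel 3x3 feature maps.
f1 = [[[1, 2, 3], [4, 5, 6], [7, 8, 9]],
      [[1, 1, 1], [1, 1, 1], [1, 1, 1]]]
f2 = [[[1, 0, 0], [0, 1, 0], [0, 0, 1]],
      [[2, 2, 2], [2, 2, 2], [2, 2, 2]]]

single = patch_correlation(f1, f2, (1, 1), (1, 1), k=0)  # K = 1 -> 7.0
full = patch_correlation(f1, f2, (1, 1), (1, 1), k=1)    # K = 3 -> 33.0
```

With $$k = 0$$ the sum reduces to a single inner product between the two channel vectors at the patch centers; with $$k = 1$$ it accumulates nine of them.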

Note that the equation above is identical to one step of a convolution in neural networks, but instead of convolving data with a filter, it convolves data with other data. For this reason, it has no trainable weights.

Computing $$c(x_{1}, x_{2})$$ involves $$c \cdot K^{2}$$ multiplications, where $$c$$ in this product is the number of channels. Comparing all patch combinations involves $$w^{2} \cdot h^{2}$$ such computations.
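To get a feel for these counts, the arithmetic can be worked through for one set of sizes. The function name `correlation_cost` and the example dimensions are assumptions chosen for illustration; they do not come from the operator.

```python
def correlation_cost(w, h, c, k):
    """Return (multiplications per comparison, comparisons, total multiplications)."""
    K = 2 * k + 1                      # patch side length
    mults_per_comparison = c * K ** 2  # c * K^2
    comparisons = (w * h) ** 2         # w^2 * h^2: every location against every location
    return mults_per_comparison, comparisons, mults_per_comparison * comparisons


# Hypothetical feature-map size: 64 x 48 with 256 channels, 3x3 patches.
per_patch, comparisons, total = correlation_cost(w=64, h=48, c=256, k=1)
# per_patch   -> 2304
# comparisons -> 9437184
# total is on the order of 2e10 multiplications
```

Even for this modest map size the exhaustive comparison needs tens of billions of multiplications.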

Given a maximum displacement $$d$$, for each location $$x_{1}$$ the layer computes correlations $$c(x_{1}, x_{2})$$ only in a neighborhood of size $$D:=2d+1$$, by limiting the range of $$x_{2}$$. The strides $$s_{1}$$ and $$s_{2}$$ quantize $$x_{1}$$ globally and $$x_{2}$$ within the neighborhood centered around $$x_{1}$$, respectively.

The final output is defined by the following expression:

$out[n, q, i, j] = c(x_{i, j}, x_{q})$

where $$n$$ indexes the sample in the batch, $$i$$ and $$j$$ enumerate spatial locations in $$f_{1}$$, and $$q$$ denotes the $$q^{th}$$ neighborhood of $$x_{i,j}$$.
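The output layout can be sketched for a single sample in plain Python. This is a rough illustration under stated assumptions, not the operator's actual code: the function name `correlation_sketch`, the row-major ordering of the displacements indexed by $$q$$, and the choice to treat positions outside the maps as zero are all assumptions, and the real operator may differ in padding, border handling, and normalization.

```python
def correlation_sketch(f1, f2, k=0, d=1, s1=1, s2=1):
    """Return (out, displacements) with out[q][i][j] for one sample.

    f1, f2: feature maps as nested lists indexed [channel][row][col] (assumed layout).
    """
    channels, height, width = len(f1), len(f1[0]), len(f1[0][0])

    def at(fmap, ch, r, c):
        # Assumption: positions outside the map contribute zero.
        if 0 <= r < height and 0 <= c < width:
            return fmap[ch][r][c]
        return 0.0

    def patch_corr(r1, c1, r2, c2):
        # c(x1, x2) over a (2k+1) x (2k+1) patch.
        return sum(
            at(f1, ch, r1 + dy, c1 + dx) * at(f2, ch, r2 + dy, c2 + dx)
            for dy in range(-k, k + 1)
            for dx in range(-k, k + 1)
            for ch in range(channels)
        )

    # Displacements inside [-d, d], quantized by s2; q indexes them row-major.
    steps = range(-(d // s2) * s2, d + 1, s2)
    displacements = [(dy, dx) for dy in steps for dx in steps]

    out = []
    for dy, dx in displacements:
        # x1 is quantized globally by s1; x2 = x1 + displacement.
        plane = [
            [patch_corr(i, j, i + dy, j + dx) for j in range(0, width, s1)]
            for i in range(0, height, s1)
        ]
        out.append(plane)
    return out, displacements


# Hypothetical toy data: one channel, a single active pixel that moves one column right.
f1 = [[[0, 0, 0], [0, 1, 0], [0, 0, 0]]]
f2 = [[[0, 0, 0], [0, 0, 1], [0, 0, 0]]]

out, displacements = correlation_sketch(f1, f2, k=0, d=1, s1=1, s2=1)
best_q = max(range(len(out)), key=lambda q: out[q][1][1])
best_displacement = displacements[best_q]  # (0, 1): zero rows down, one column right
```

In this sketch, $$d = 1$$ and $$s_{2} = 1$$ give nine displacement planes, and the plane with the strongest response at the center location corresponds to the shift between the two maps.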

Defined in src/operator/correlation.cc:L198

Parameters
• data1 (NDArray) – Input data1 to the correlation.

• data2 (NDArray) – Input data2 to the correlation.

• kernel_size (int (non-negative), optional, default=1) – Kernel size for Correlation; must be an odd number.

• max_displacement (int (non-negative), optional, default=1) – Maximum displacement of Correlation.

• stride1 (int (non-negative), optional, default=1) – stride1 quantizes data1 globally.

• stride2 (int (non-negative), optional, default=1) – stride2 quantizes data2 within the neighborhood centered around data1.