# Conv2d

class Conv2d(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, groups=1, bias=True, conv_mode='cross_correlation', compute_mode='default', padding_mode='zeros', **kwargs)

Applies a 2D convolution over an input tensor. For batch element $N_i$ and output channel $C_{\text{out}_j}$, the output is computed as follows, where $\star$ denotes the 2D cross-correlation operator:

$\text{out}(N_i, C_{\text{out}_j}) = \text{bias}(C_{\text{out}_j}) + \sum_{k = 0}^{C_{\text{in}} - 1} \text{weight}(C_{\text{out}_j}, k) \star \text{input}(N_i, k)$

input: $$(N, C_{\text{in}}, H_{\text{in}}, W_{\text{in}})$$

output: $$(N, C_{\text{out}}, H_{\text{out}}, W_{\text{out}})$$, where

$\text{H}_{out} = \lfloor \frac{\text{H}_{in} + 2 * \text{padding[0]} - \text{dilation[0]} * (\text{kernel_size[0]} - 1) - 1}{\text{stride[0]}} + 1 \rfloor$
$\text{W}_{out} = \lfloor \frac{\text{W}_{in} + 2 * \text{padding[1]} - \text{dilation[1]} * (\text{kernel_size[1]} - 1) - 1}{\text{stride[1]}} + 1 \rfloor$
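
The two output-size formulas above can be checked with a few lines of plain Python. This is a minimal sketch, not part of the library: the helper names `_pair` and `conv2d_output_size` are hypothetical, and the int-to-pair expansion is assumed to apply to `stride`, `padding` and `dilation` the same way the parameter list describes it for `kernel_size`.

```python
def _pair(value):
    """Expand an int to (value, value); pass a 2-tuple through unchanged."""
    if isinstance(value, int):
        return (value, value)
    first, second = value
    return (first, second)


def conv2d_output_size(h_in, w_in, kernel_size, stride=1, padding=0, dilation=1):
    """Return (H_out, W_out) according to the formulas above (hypothetical helper)."""
    kernel_size, stride, padding, dilation = (
        _pair(v) for v in (kernel_size, stride, padding, dilation)
    )

    def one_dim(size, k, s, p, d):
        # floor((size + 2p - d(k - 1) - 1) / s + 1), written with integer floor division
        return (size + 2 * p - d * (k - 1) - 1) // s + 1

    h_out = one_dim(h_in, kernel_size[0], stride[0], padding[0], dilation[0])
    w_out = one_dim(w_in, kernel_size[1], stride[1], padding[1], dilation[1])
    return (h_out, w_out)


# A 4x4 input with a 3x3 kernel, stride 1 and no padding gives a 2x2 output.
print(conv2d_output_size(4, 4, kernel_size=3))
```

Integer floor division is used instead of `math.floor` on a float so that the result stays exact for large sizes.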

When groups == in_channels and out_channels == K * in_channels, where K is a positive integer, this operation is also known as a depthwise convolution.

In other words, for an input of size $$(N, C_{\text{in}}, H_{\text{in}}, W_{\text{in}})$$, a depthwise convolution with a depthwise multiplier K can be constructed with the arguments $$(in\_channels=C_{\text{in}}, out\_channels=C_{\text{in}} \times K, ..., groups=C_{\text{in}})$$.
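
As an illustration of the depthwise case, the sketch below builds the argument set described above and shows which input channels feed a given output channel. Both helper names are hypothetical. The channel mapping assumes the usual contiguous grouping, in which output channel `j` belongs to group `j // (out_channels // groups)`; this page does not state that convention explicitly.

```python
def depthwise_conv_args(c_in, multiplier, kernel_size):
    """Keyword arguments for a depthwise Conv2d with multiplier K (hypothetical helper)."""
    if multiplier < 1:
        raise ValueError("multiplier K must be a positive integer")
    return {
        "in_channels": c_in,
        "out_channels": c_in * multiplier,
        "kernel_size": kernel_size,
        "groups": c_in,
    }


def input_channels_for_output(j, in_channels, out_channels, groups):
    """Input channels read by output channel j, assuming contiguous groups."""
    if in_channels % groups or out_channels % groups:
        raise ValueError("in_channels and out_channels must be divisible by groups")
    out_per_group = out_channels // groups
    in_per_group = in_channels // groups
    group = j // out_per_group
    return list(range(group * in_per_group, (group + 1) * in_per_group))


args = depthwise_conv_args(c_in=3, multiplier=2, kernel_size=3)
# With groups == in_channels, every output channel reads exactly one input channel.
for j in range(args["out_channels"]):
    print(j, input_channels_for_output(j, args["in_channels"], args["out_channels"], args["groups"]))
```

The resulting dictionary could be passed on as `M.Conv2d(**args)`, given that its keys match the constructor parameters listed below.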

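Parameters:

• in_channels (int) – number of channels in the input data.

• out_channels (int) – number of channels in the output data.

• kernel_size (Union[int, Tuple[int, int]]) – size of the weight on the spatial dimensions. If kernel_size is an int, the actual kernel size is (kernel_size, kernel_size).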

• stride (Union[int, Tuple[int, int]]) – stride of the 2D convolution operation. Default: 1.

• padding (Union[int, Tuple[int, int]]) – size of the paddings added to the input on both sides of its spatial dimensions. Default: 0.

• dilation (Union[int, Tuple[int, int]]) – dilation of the 2D convolution operation. Default: 1.

• groups (int) – number of groups into which the input and output channels are divided, so as to perform a grouped convolution. When groups is not 1, in_channels and out_channels must be divisible by groups, and the shape of weight should be (groups, out_channels // groups, in_channels // groups, height, width). Default: 1.

• bias (bool) – whether to add a bias onto the result of convolution. Default: True.

• conv_mode (str) – supports cross_correlation. Default: cross_correlation.

• compute_mode (str) – when set to “default”, no special requirements are placed on the precision of intermediate results. When set to “float32”, float32 is used for the accumulator and intermediate results, but this only takes effect when the input and output are of float16 dtype. Default: default.

Shape:

input: $$(N, C_{\text{in}}, H_{\text{in}}, W_{\text{in}})$$. output: $$(N, C_{\text{out}}, H_{\text{out}}, W_{\text{out}})$$.

• weight usually has shape (out_channels, in_channels, height, width); if groups is not 1, the shape should be (groups, out_channels // groups, in_channels // groups, height, width).

• bias usually has shape (1, out_channels, *1).
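
The shape rules above can be written down as a small sketch. The function names are hypothetical and not library API. Reading `*1` in the bias shape as one trailing 1 per spatial dimension, which gives `(1, out_channels, 1, 1)` for the 2D case, is an assumption.

```python
def conv2d_weight_shape(in_channels, out_channels, kernel_size, groups=1):
    """Weight shape following the rule stated above (hypothetical helper)."""
    if isinstance(kernel_size, int):
        height, width = kernel_size, kernel_size
    else:
        height, width = kernel_size
    if in_channels % groups or out_channels % groups:
        raise ValueError("in_channels and out_channels must be divisible by groups")
    if groups == 1:
        return (out_channels, in_channels, height, width)
    return (groups, out_channels // groups, in_channels // groups, height, width)


def conv2d_bias_shape(out_channels):
    """Bias shape, assuming '*1' means one trailing 1 per spatial dimension."""
    return (1, out_channels, 1, 1)


print(conv2d_weight_shape(3, 1, 3))                  # ungrouped: 4-D weight
print(conv2d_weight_shape(4, 8, (3, 5), groups=2))   # grouped: 5-D weight
print(conv2d_bias_shape(8))
```

Note that the grouped weight gains a leading `groups` axis, so code that inspects the weight should not assume it is always 4-dimensional.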

Returns: module. The instance of the Conv2d module.

>>> import numpy as np
>>> import megengine as mge
>>> import megengine.module as M
>>> m = M.Conv2d(in_channels=3, out_channels=1, kernel_size=3)
>>> inp = mge.tensor(np.arange(0, 96).astype("float32").reshape(2, 3, 4, 4))
>>> oup = m(inp)
>>> oup.numpy().shape
(2, 1, 2, 2)
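
To make the summation at the top of this page concrete, here is a naive pure-Python reference for the simplest case only (groups=1, stride=1, padding=0, dilation=1). It is a sketch for checking shapes and values by hand, not how the library computes the result. The function name `naive_conv2d` and the all-ones weight are illustrative assumptions, since the module's actual initial weights are not given here.

```python
def naive_conv2d(inp, weight, bias):
    """Cross-correlation over nested lists.

    inp:    [N][C_in][H_in][W_in]
    weight: [C_out][C_in][KH][KW]
    bias:   [C_out]
    """
    n, c_in = len(inp), len(inp[0])
    h_in, w_in = len(inp[0][0]), len(inp[0][0][0])
    c_out = len(weight)
    kh, kw = len(weight[0][0]), len(weight[0][0][0])
    h_out, w_out = h_in - kh + 1, w_in - kw + 1

    out = [[[[0.0] * w_out for _ in range(h_out)] for _ in range(c_out)] for _ in range(n)]
    for i in range(n):
        for j in range(c_out):
            for y in range(h_out):
                for x in range(w_out):
                    acc = bias[j]
                    # Sum over input channels k and over the kernel window.
                    for k in range(c_in):
                        for dy in range(kh):
                            for dx in range(kw):
                                acc += weight[j][k][dy][dx] * inp[i][k][y + dy][x + dx]
                    out[i][j][y][x] = acc
    return out


# Same input as the example above: values 0..95 laid out as (2, 3, 4, 4).
values = iter(range(96))
inp = [[[[float(next(values)) for _ in range(4)] for _ in range(4)]
        for _ in range(3)] for _ in range(2)]
weight = [[[[1.0] * 3 for _ in range(3)] for _ in range(3)]]  # (1, 3, 3, 3), all ones
bias = [0.0]

out = naive_conv2d(inp, weight, bias)
print(len(out), len(out[0]), len(out[0][0]), len(out[0][0][0]))  # 2 1 2 2
print(out[0][0][0][0])  # 567.0, the sum of the top-left 3x3 window over all 3 channels
```

The output shape matches the `(2, 1, 2, 2)` reported by the module in the example above.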