Conv3d

class Conv3d(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, groups=1, bias=True, conv_mode='cross_correlation')

Applies a 3D convolution over an input tensor.

For instance, given an input of size $(N, C_{in}, T, H, W)$, the output is computed as:

$$\text{out}(N_i, C_{out_j}) = \text{bias}(C_{out_j}) + \sum_{k=0}^{C_{in}-1} \text{weight}(C_{out_j}, k) \star \text{input}(N_i, k)$$

Here $\star$ is the valid 3D cross-correlation operator, $N$ is the batch size, and $C$ denotes the number of channels.
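The formula above can be sketched in NumPy as a naive reference loop. This is an illustrative sketch only, not MegEngine's implementation: the function name `conv3d_naive` is hypothetical, and it assumes stride 1, no padding, no dilation, and groups == 1.

```python
import numpy as np

def conv3d_naive(inp, weight, bias=None):
    """Naive valid 3D cross-correlation (hypothetical helper, not a MegEngine API).

    inp:    (N, C_in, T, H, W)
    weight: (C_out, C_in, kT, kH, kW)
    bias:   (C_out,) or None
    Assumes stride=1, padding=0, dilation=1, groups=1.
    """
    n, c_in, t, h, w = inp.shape
    c_out, c_in_w, kt, kh, kw = weight.shape
    assert c_in == c_in_w, "input and weight disagree on C_in"
    ot, oh, ow = t - kt + 1, h - kh + 1, w - kw + 1
    out = np.zeros((n, c_out, ot, oh, ow), dtype=inp.dtype)
    for a in range(ot):
        for b in range(oh):
            for c in range(ow):
                # Window of shape (N, C_in, kT, kH, kW) at this output position.
                patch = inp[:, :, a:a + kt, b:b + kh, c:c + kw]
                # Sum over input channels and kernel positions -> (N, C_out).
                out[:, :, a, b, c] = np.tensordot(
                    patch, weight, axes=([1, 2, 3, 4], [1, 2, 3, 4])
                )
    if bias is not None:
        out += bias.reshape(1, -1, 1, 1, 1)
    return out

x = np.arange(0, 384).astype("float32").reshape(2, 3, 4, 4, 4)
w = np.ones((1, 3, 3, 3, 3), dtype="float32")
print(conv3d_naive(x, w).shape)  # (2, 1, 2, 2, 2)
```

The kernel is slid over the input without being flipped, which is what distinguishes cross-correlation from convolution in the strict mathematical sense.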

When groups == in_channels and out_channels == K * in_channels, where K is a positive integer, this operation is also known as depthwise convolution.

In other words, for an input of size $(N, C_{in}, T_{in}, H_{in}, W_{in})$, a depthwise convolution can be constructed with the arguments $(in\_channels = C_{in}, out\_channels = C_{in} \times K, ..., groups = C_{in})$.
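As a small illustration of the depthwise rule, the constructor arguments can be derived from $C_{in}$ and $K$. The helper `depthwise_conv3d_kwargs` below is hypothetical (it is not part of MegEngine); it only builds a dictionary of keyword arguments matching the signature documented here.

```python
def depthwise_conv3d_kwargs(c_in, k, kernel_size):
    """Build Conv3d keyword arguments for a depthwise convolution.

    Hypothetical helper: c_in is the number of input channels and k is the
    positive integer channel multiplier K from the text above.
    """
    if c_in <= 0 or k <= 0:
        raise ValueError("c_in and k must be positive integers")
    return dict(
        in_channels=c_in,
        out_channels=c_in * k,  # out_channels == K * in_channels
        kernel_size=kernel_size,
        groups=c_in,            # groups == in_channels
    )

kwargs = depthwise_conv3d_kwargs(c_in=4, k=2, kernel_size=3)
print(kwargs)
# With MegEngine installed, these could be passed as M.Conv3d(**kwargs).
```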

Parameters

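in_channels (int) – Number of channels in the input data.
out_channels (int) – Number of channels in the output data.
kernel_size (Union[int, Tuple[int, int, int]]) – Size of the weight along the spatial dimensions. If kernel_size is an int, the actual kernel size is (kernel_size, kernel_size, kernel_size).
stride (Union[int, Tuple[int, int, int]]) – Stride of the 3D convolution. Default: 1
padding (Union[int, Tuple[int, int, int]]) – Size of the padding added to both sides of each spatial dimension of the input. Only zero-padding is supported. Default: 0
dilation (Union[int, Tuple[int, int, int]]) – Dilation of the 3D convolution. Default: 1
groups (int) – Number of groups into which the input and output channels are divided, so as to perform a grouped convolution. When groups is not 1, in_channels and out_channels must both be divisible by groups, and the weight shape should be (groups, out_channels // groups, in_channels // groups, depth, height, width). Default: 1
bias (bool) – Whether to add a bias to the result of the convolution. Default: True
conv_mode (str) – Supports cross_correlation. Default: cross_correlation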
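To see how kernel_size, stride, padding and dilation interact, the output shape can be estimated with the usual convolution arithmetic. This is a sketch under an assumption: the page does not state the formula, so the floor-based rule below is the standard convention rather than something confirmed here, and `conv3d_output_shape` is a hypothetical helper.

```python
def _triple(v):
    """Expand an int to a 3-tuple, mirroring how int arguments are treated."""
    return (v, v, v) if isinstance(v, int) else tuple(v)

def conv3d_output_shape(input_shape, out_channels, kernel_size,
                        stride=1, padding=0, dilation=1):
    """Estimate the output shape of a 3D convolution (hypothetical helper).

    Assumes, per spatial dimension:
        out = floor((in + 2*padding - dilation*(kernel - 1) - 1) / stride) + 1
    """
    n, _, *spatial = input_shape
    k, s, p, d = map(_triple, (kernel_size, stride, padding, dilation))
    out = tuple(
        (x + 2 * p[i] - d[i] * (k[i] - 1) - 1) // s[i] + 1
        for i, x in enumerate(spatial)
    )
    return (n, out_channels) + out

print(conv3d_output_shape((2, 3, 4, 4, 4), out_channels=1, kernel_size=3))
# (2, 1, 2, 2, 2)
```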

Note

The shape of weight is usually (out_channels, in_channels, depth, height, width). If groups is not 1, the shape is (groups, out_channels // groups, in_channels // groups, depth, height, width).
The shape of bias is usually (1, out_channels, *1).
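The two shape rules in this note can be written out as a small pure-Python sketch. The helper `conv3d_param_shapes` is hypothetical, and reading `*1` in the bias shape as three trailing dimensions of size 1 is an assumption made for illustration.

```python
def conv3d_param_shapes(in_channels, out_channels, kernel_size, groups=1):
    """Return (weight_shape, bias_shape) following the note above.

    Hypothetical helper, not a MegEngine API.
    """
    if in_channels % groups or out_channels % groups:
        raise ValueError("in_channels and out_channels must be divisible by groups")
    if isinstance(kernel_size, int):
        kernel_size = (kernel_size,) * 3
    depth, height, width = kernel_size
    if groups == 1:
        weight = (out_channels, in_channels, depth, height, width)
    else:
        weight = (groups, out_channels // groups, in_channels // groups,
                  depth, height, width)
    # Assumption: "*1" expands to one singleton per spatial dimension.
    bias = (1, out_channels, 1, 1, 1)
    return weight, bias

print(conv3d_param_shapes(3, 1, 3))            # ((1, 3, 3, 3, 3), (1, 1, 1, 1, 1))
print(conv3d_param_shapes(4, 8, 3, groups=4))  # ((4, 2, 1, 3, 3, 3), (1, 8, 1, 1, 1))
```

The second call is the depthwise case with K = 2: each of the 4 groups maps 1 input channel to 2 output channels.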

Examples

import numpy as np
import megengine as mge
import megengine.module as M

m = M.Conv3d(in_channels=3, out_channels=1, kernel_size=3)
inp = mge.tensor(np.arange(0, 384).astype("float32").reshape(2, 3, 4, 4, 4))
oup = m(inp)
print(oup.numpy().shape)

Output:

(2, 1, 2, 2, 2)
