# Conv Differences

$\operatorname{out}(N_i, C_{\text{out}_j}) = \operatorname{bias}(C_{\text{out}_j}) + \sum_{k=0}^{C_{\text{in}}-1} \operatorname{weight}(C_{\text{out}_j}, k) \star \operatorname{input}(N_i, k)$

## Weight shapes differ

PyTorch's weight has shape (out_channels, in_channels // groups, kernel_size...). MegEngine's weight has shape (out_channels, in_channels, kernel_size...) when groups=1, and (groups, out_channels // groups, in_channels // groups, kernel_size...) otherwise.

- Conv1d: kernel_size... is kernel_length
- Conv2d: kernel_size... is kernel_height, kernel_width
- Conv3d: kernel_size... is kernel_depth, kernel_height, kernel_width

```python
import megengine
import torch

m_conv = megengine.module.Conv2d(10, 20, kernel_size=3, padding=1, groups=2)
t_conv = torch.nn.Conv2d(10, 20, kernel_size=3, padding=1, groups=2)
print(m_conv.weight.shape)  # (2, 10, 5, 3, 3)
print(t_conv.weight.shape)  # torch.Size([20, 5, 3, 3])
```

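Because MegEngine's grouped layout only splits the out_channels axis into (groups, out_channels // groups), a PyTorch-layout weight can be rearranged with a plain reshape. A minimal sketch, using numpy in place of real framework tensors to keep the example self-contained:

```python
import numpy as np

# PyTorch-layout weight for a Conv2d(10, 20, kernel_size=3, groups=2):
# (out_channels, in_channels // groups, kernel_height, kernel_width)
groups = 2
w_torch = np.zeros((20, 5, 3, 3), dtype=np.float32)

# MegEngine's grouped layout splits out_channels into
# (groups, out_channels // groups); the data order is unchanged.
w_mge = w_torch.reshape(groups, 20 // groups, 5, 3, 3)
print(w_mge.shape)  # (2, 10, 5, 3, 3)
```

The reverse direction (MegEngine to PyTorch) is the symmetric reshape back to (out_channels, in_channels // groups, kernel_height, kernel_width).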

## Bias shapes differ

PyTorch's bias has shape (out_channels,), while MegEngine's bias has shape (1, out_channels, dims...), where the omitted dims are all 1.

- Conv1d: dims... is 1
- Conv2d: dims... is 1, 1
- Conv3d: dims... is 1, 1, 1

```python
import megengine
import torch

m_conv = megengine.module.Conv2d(10, 20, kernel_size=3)
t_conv = torch.nn.Conv2d(10, 20, kernel_size=3)
print(m_conv.bias.shape)  # (1, 20, 1, 1)
print(t_conv.bias.shape)  # torch.Size([20])
```
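Since the extra axes are all of size 1, converting a bias between the two layouts is also just a reshape. A minimal sketch, again using numpy rather than real framework tensors:

```python
import numpy as np

# PyTorch-layout bias for a Conv2d with out_channels=20: (out_channels,)
b_torch = np.zeros((20,), dtype=np.float32)

# MegEngine's Conv2d bias layout adds singleton batch and spatial axes:
# (1, out_channels, 1, 1), which broadcasts over N, H, W.
b_mge = b_torch.reshape(1, 20, 1, 1)
print(b_mge.shape)  # (1, 20, 1, 1)
```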