Use Module to define the model structure#

The neural network model is composed of various layers, or modules, that perform operations on input data.

[Figure: simplified illustration of the AlexNet architecture]

The figure above shows the classic AlexNet model structure, which includes the classic convolutional (conv) layers and fully connected (fc) modules, among others.

An abstraction of this structure is provided in the megengine.module namespace:

  • Common neural network modules such as Conv2d are implemented in this namespace, which makes it convenient for users to quickly build model structures;

  • All of these modules inherit from the Module base class; please refer to Module base class concept and interface introduction;

  • In addition, a Sequential container is provided, which is helpful when defining complex structures (see the sketch below).
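For instance, a small stack of layers can be chained without writing a custom class. The following is a minimal sketch (the layer types and sizes are arbitrary choices for illustration):

import megengine.module as M

# calling the container applies the layers in order
block = M.Sequential(
    M.Conv2d(3, 16, 5),
    M.ReLU(),
    M.Conv2d(16, 32, 5),
)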

Warning

The capitalized Module in MegEngine refers to the base class that is frequently used when designing model structures. It needs to be distinguished from the lowercase module concept in Python, which refers to a file that can be imported. The statement import megengine.module as M actually imports the file module named module.py (usually abbreviated as M).

See also

This chapter mainly introduces the Float32 Module used by default and the parameter initialization module init. The QAT Module and Quantized Module used in quantized models will be introduced in Quantization.

Basic usage example#

The following code demonstrates how to use the basic components of Module to quickly design a convolutional neural network structure:

  • All network structures are derived from the base class M.Module, and the constructor must first call super().__init__();

  • In the constructor, declare all the layers/modules to be used;

  • In the forward function, define how the model runs, from input to output.

import megengine.functional as F
import megengine.module as M

class ConvNet(M.Module):
    def __init__(self):
        # this is the place where you instantiate all your modules
        # you can later access them using the same names you've given them in
        # here
        super().__init__()
        self.conv1 = M.Conv2d(1, 10, 5)
        self.pool1 = M.MaxPool2d(2, 2)
        self.conv2 = M.Conv2d(10, 20, 5)
        self.pool2 = M.MaxPool2d(2, 2)
        self.fc1 = M.Linear(320, 50)
        self.fc2 = M.Linear(50, 10)

    # it's the forward function that defines the network structure
    # we're accepting only a single input in here, but if you want,
    # feel free to use more
    def forward(self, input):
        x = self.pool1(F.relu(self.conv1(input)))
        x = self.pool2(F.relu(self.conv2(x)))
        x = F.flatten(x, 1)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))

        return x

Pay attention to the following points:

  • ConvNet is also a Module, just like Conv2d and Linear, which means it can be used as a substructure of other modules. This flexible nesting mechanism between Modules allows users to design very complex model structures in a relatively simple way.

  • In the process of defining the model, any Python code can be used to organize the model structure. Conditional and loop control-flow statements are completely legal and are handled well by the automatic differentiation mechanism. You can even create a loop in the forward pass in which the same module is reused (see the sketch below).
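The following sketch illustrates both points: a hypothetical LoopNet nests a smaller Block as a substructure and reuses it several times inside an ordinary Python for loop in forward (all names and sizes here are illustrative, not part of the MegEngine API):

import megengine.functional as F
import megengine.module as M

class Block(M.Module):
    # a small sub-module that can be nested inside other modules
    def __init__(self, dim):
        super().__init__()
        self.fc = M.Linear(dim, dim)

    def forward(self, x):
        return F.relu(self.fc(x))

class LoopNet(M.Module):
    # nests Block as a substructure and reuses it in the forward pass
    def __init__(self, dim=16, steps=3):
        super().__init__()
        self.block = Block(dim)  # a Module nested inside another Module
        self.steps = steps

    def forward(self, x):
        # ordinary Python control flow; each application of self.block
        # is recorded by the automatic differentiation mechanism
        for _ in range(self.steps):
            x = self.block(x)
        return x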

Let’s create an instance and try it out:

>>> net = ConvNet()
>>> net
ConvNet(
  (conv1): Conv2d(1, 10, kernel_size=(5, 5))
  (pool1): MaxPool2d(kernel_size=2, stride=2, padding=0)
  (conv2): Conv2d(10, 20, kernel_size=(5, 5))
  (pool2): MaxPool2d(kernel_size=2, stride=2, padding=0)
  (fc1): Linear(in_features=320, out_features=50, bias=True)
  (fc2): Linear(in_features=50, out_features=10, bias=True)
)

Note

All Modules only support mini-batches of samples as input, not single samples.

For example, Conv2d takes as input a 4-dimensional Tensor of shape nSamples x nChannels x Height x Width.

If you have a single sample, you need to use expand_dims to add a batch dimension (see the sketch below).
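For example, a single C x H x W sample can be given a batch dimension like this (a minimal sketch using expand_dims):

>>> import numpy as np
>>> import megengine
>>> import megengine.functional as F
>>> sample = megengine.Tensor(np.random.randn(1, 28, 28))  # C x H x W
>>> F.expand_dims(sample, 0).shape                         # N x C x H x W
(1, 1, 28, 28)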

We create a mini-batch of data containing a single sample (i.e. batch_size=1) and feed it into ConvNet:

>>> import numpy as np
>>> import megengine
>>> input = megengine.Tensor(np.random.randn(1, 1, 28, 28))
>>> out = net(input)
>>> out.shape
(1, 10)

The output of ConvNet is a Tensor. We can use it together with the target label to calculate the loss, and then use automatic differentiation to complete the backpropagation process. However, by default no Tensor requires gradients, so before that we need a gradient manager to bind the parameters of the Module and record the gradient information during the forward calculation; a sketch of this workflow follows. To understand the process in depth, please refer to Basic principles and use of Autodiff.
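The following is a minimal sketch of that workflow, reusing the net and input defined above; the label tensor is a hypothetical stand-in for real targets:

>>> import numpy as np
>>> import megengine
>>> import megengine.functional as F
>>> from megengine.autodiff import GradManager
>>> label = megengine.Tensor(np.random.randint(0, 10, size=(1,)))  # hypothetical target
>>> gm = GradManager().attach(net.parameters())  # bind the Module parameters
>>> with gm.record():                            # record the forward computation
...     out = net(input)
...     loss = F.loss.cross_entropy(out, label)
...     gm.backward(loss)                        # backpropagation fills the gradients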

More usage scenarios#

See also

The Module interface provides many useful attributes and methods, which are convenient in different situations, for example:

  • Use .parameters() to easily obtain an iterator over the parameters, which can be used to track gradients for automatic differentiation;

  • Each Module has its own name, and the names and their corresponding modules can be obtained through .named_modules();

  • Use .state_dict() and .load_state_dict() to get and load state information, and so on (see the sketch below).
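A brief sketch of these interfaces in action, reusing the net defined above (the file name is illustrative):

>>> params = list(net.parameters())  # all trainable parameters of the model
>>> for name, module in net.named_modules():
...     print(name)  # each sub-module together with its name
>>> import megengine
>>> megengine.save(net.state_dict(), "convnet.pkl")  # save the state dict
>>> net.load_state_dict(megengine.load("convnet.pkl"))  # load it back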

For more information, please refer to Module base class concept and interface introduction.