megengine.optimizer#

>>> import megengine.optimizer as optim

Optimizer

Base class for all optimizers.
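
Every optimizer below follows the same Optimizer interface: construct it from an iterable of parameters, compute gradients, then call step() and clear_grad(). A minimal training-loop sketch, assuming a toy Linear model and a GradManager from megengine.autodiff; the shapes, learning rate, and placeholder loss are illustrative only:

>>> import megengine as mge
>>> import megengine.module as M
>>> import megengine.optimizer as optim
>>> from megengine.autodiff import GradManager
>>> model = mge.module.Linear(4, 2) if False else M.Linear(4, 2)  # toy model for illustration
>>> opt = optim.SGD(model.parameters(), lr=0.1)
>>> gm = GradManager().attach(model.parameters())
>>> x = mge.tensor([[1.0, 2.0, 3.0, 4.0]])
>>> with gm:                           # record the forward pass for autodiff
...     loss = model(x).mean()         # placeholder loss for illustration
...     gm.backward(loss)              # populate parameter gradients
>>> opt.step()                         # apply the update rule
>>> opt.clear_grad()                   # reset gradients for the next iteration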

Common optimizers#

SGD

Implements stochastic gradient descent.
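
A minimal construction sketch; model is assumed to be any Module, and the momentum and weight_decay values are illustrative choices rather than defaults:

>>> opt = optim.SGD(model.parameters(), lr=0.01, momentum=0.9, weight_decay=1e-4)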

AdamW

Implements AdamW algorithm proposed in "Decoupled Weight Decay Regularization".
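
AdamW decouples the weight decay term from the gradient-based update. A minimal sketch; the betas and weight_decay shown are common choices, not necessarily the defaults:

>>> opt = optim.AdamW(model.parameters(), lr=1e-3, betas=(0.9, 0.999), weight_decay=0.01)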

Adam

Implements Adam algorithm proposed in "Adam: A Method for Stochastic Optimization".
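
Constructed the same way as AdamW, except that a nonzero weight_decay here acts as a classic L2 penalty folded into the gradient rather than AdamW's decoupled form. Illustrative values:

>>> opt = optim.Adam(model.parameters(), lr=1e-3, betas=(0.9, 0.999), eps=1e-8)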

Adagrad

Implements Adagrad algorithm.

Adadelta

Implements Adadelta algorithm.
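
Adagrad and Adadelta are constructed the same way from an iterable of parameters; the learning rates below are illustrative starting points:

>>> opt = optim.Adagrad(model.parameters(), lr=0.01)
>>> opt = optim.Adadelta(model.parameters(), lr=1.0)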

LAMB

Implements LAMB algorithm.

LAMBFp16

Implements the LAMB algorithm for float16 (fp16) parameters.
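
A minimal construction sketch for LAMB with an illustrative learning rate; LAMBFp16 is assumed to follow the same pattern but operate on float16 parameters:

>>> opt = optim.LAMB(model.parameters(), lr=1e-3)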

Learning rate scheduling#

LRScheduler

Base class for all learning rate based schedulers.

MultiStepLR

Decays the learning rate of each parameter group by gamma once the number of epochs reaches one of the milestones.
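
A minimal sketch, assuming opt is one of the optimizers above; the milestones and gamma values are illustrative:

>>> scheduler = optim.MultiStepLR(opt, milestones=[30, 80], gamma=0.1)
>>> for epoch in range(100):
...     ...                      # training code for one epoch goes here
...     scheduler.step()         # lr is multiplied by gamma after epochs 30 and 80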

Gradient processing#

clip_grad_norm

Clips gradient norm of an iterable of parameters.
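
Typically called after backward and before the optimizer step; a minimal sketch, assuming gradients have already been computed and max_norm=1.0 is an illustrative bound:

>>> optim.clip_grad_norm(model.parameters(), max_norm=1.0)
>>> opt.step()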

clip_grad_value

Clips gradients of an iterable of parameters to a specified lower and upper bound.
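
Same placement in the training loop as clip_grad_norm, but each gradient element is clamped into a range; the bounds below are illustrative:

>>> optim.clip_grad_value(model.parameters(), lower=-0.5, upper=0.5)
>>> opt.step()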