megengine.optimizer#

>>> import megengine.optimizer as optim

Optimizer

Base class for all optimizers.
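
Every optimizer below follows the same Optimizer interface: construct it from an iterable of parameters, compute gradients, then call step() and clear_grad(). A minimal training-loop sketch, assuming a toy Linear model and a GradManager from megengine.autodiff; the shapes, learning rate, and placeholder loss are illustrative only:

>>> import megengine as mge
>>> import megengine.module as M
>>> import megengine.optimizer as optim
>>> from megengine.autodiff import GradManager
>>> model = mge.module.Linear(4, 2) if False else M.Linear(4, 2)  # toy model for illustration
>>> opt = optim.SGD(model.parameters(), lr=0.1)
>>> gm = GradManager().attach(model.parameters())
>>> x = mge.tensor([[1.0, 2.0, 3.0, 4.0]])
>>> with gm:                           # record the forward pass for autodiff
...     loss = model(x).mean()         # placeholder loss for illustration
...     gm.backward(loss)              # populate parameter gradients
>>> opt.step()                         # apply the update rule
>>> opt.clear_grad()                   # reset gradients for the next iteration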

Common optimizers#

SGD

Implements stochastic gradient descent.
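
A minimal construction sketch; model is assumed to be any Module, and the momentum and weight_decay values are illustrative choices rather than defaults:

>>> opt = optim.SGD(model.parameters(), lr=0.01, momentum=0.9, weight_decay=1e-4)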

AdamW

Implements AdamW algorithm proposed in "Decoupled Weight Decay Regularization".
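
AdamW decouples the weight decay term from the gradient-based update. A minimal sketch; the betas and weight_decay shown are common choices, not necessarily the defaults:

>>> opt = optim.AdamW(model.parameters(), lr=1e-3, betas=(0.9, 0.999), weight_decay=0.01)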

Adam

Implements Adam algorithm proposed in "Adam: A Method for Stochastic Optimization".
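
Constructed the same way as AdamW, except that a nonzero weight_decay here acts as a classic L2 penalty folded into the gradient rather than AdamW's decoupled form. Illustrative values:

>>> opt = optim.Adam(model.parameters(), lr=1e-3, betas=(0.9, 0.999), eps=1e-8)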

Adagrad

Implements Adagrad algorithm.

Adadelta

Implements Adadelta algorithm.
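
Adagrad and Adadelta are constructed the same way from an iterable of parameters; the learning rates below are illustrative starting points:

>>> opt = optim.Adagrad(model.parameters(), lr=0.01)
>>> opt = optim.Adadelta(model.parameters(), lr=1.0)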

LAMB

Implements LAMB algorithm.

LAMBFp16

Implements the LAMB algorithm for float16 (fp16) parameters.
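
A minimal construction sketch for LAMB with an illustrative learning rate; LAMBFp16 is assumed to follow the same pattern but operate on float16 parameters:

>>> opt = optim.LAMB(model.parameters(), lr=1e-3)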

Learning rate scheduling#

LRScheduler

Base class for all learning rate based schedulers.

MultiStepLR

Decays the learning rate of each parameter group by gamma once the number of epochs reaches one of the milestones.
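
A minimal sketch, assuming opt is one of the optimizers above; the milestones and gamma values are illustrative:

>>> scheduler = optim.MultiStepLR(opt, milestones=[30, 80], gamma=0.1)
>>> for epoch in range(100):
...     ...                      # training code for one epoch goes here
...     scheduler.step()         # lr is multiplied by gamma after epochs 30 and 80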

Gradient processing#

clip_grad_norm

Clips gradient norm of an iterable of parameters.
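
Typically called after backward and before the optimizer step; a minimal sketch, assuming gradients have already been computed and max_norm=1.0 is an illustrative bound:

>>> optim.clip_grad_norm(model.parameters(), max_norm=1.0)
>>> opt.step()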

clip_grad_value

Clips gradients of an iterable of parameters to a specified lower and upper bound.
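
Same placement in the training loop as clip_grad_norm, but each gradient element is clamped into a range; the bounds below are illustrative:

>>> optim.clip_grad_value(model.parameters(), lower=-0.5, upper=0.5)
>>> opt.step()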