SGD

class SGD(params, lr, momentum=0.0, nesterov=False, weight_decay=0.0)

Implements stochastic gradient descent.

Nesterov momentum is based on the formula from "On the importance of initialization and momentum in deep learning".

Parameters:
  • params (Union[Iterable[Parameter], dict]) – iterable of parameters to optimize or dicts defining parameter groups.

  • lr (float) – learning rate.

  • momentum (float) – momentum factor. Default: 0.0

  • nesterov (bool) – enables Nesterov momentum. Default: False

  • weight_decay (float) – weight decay (L2 penalty). Default: 0.0
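
How the parameters above interact can be sketched as a single update step on one scalar parameter. This is a minimal illustration in plain Python, assuming the commonly used formulation in which weight decay is added to the gradient, the velocity accumulates the gradient, and Nesterov momentum adds a look-ahead term. The function name `sgd_step` and its scalar interface are hypothetical and not part of this class; the library's actual implementation may differ in detail.

```python
def sgd_step(param, grad, velocity, lr, momentum=0.0, nesterov=False, weight_decay=0.0):
    """One SGD update for a single scalar parameter (illustrative sketch only)."""
    # weight_decay acts as an L2 penalty: add weight_decay * param to the gradient.
    if weight_decay != 0.0:
        grad = grad + weight_decay * param

    if momentum != 0.0:
        # Accumulate the gradient into the velocity buffer.
        velocity = momentum * velocity + grad
        if nesterov:
            # Nesterov: step along the gradient plus the look-ahead velocity.
            step = grad + momentum * velocity
        else:
            step = velocity
    else:
        step = grad

    return param - lr * step, velocity


# Minimise f(w) = w**2 (gradient 2 * w), starting from w = 1.0.
w, v = 1.0, 0.0
for _ in range(100):
    w, v = sgd_step(w, 2.0 * w, v, lr=0.1, momentum=0.9, nesterov=True)
print(w)
```

With `momentum=0.0` the velocity is unused and the step reduces to `param - lr * grad`, which is why `lr` is the only hyperparameter without a default.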