class SGD(params, lr, momentum=0.0, nesterov=False, weight_decay=0.0)


This optimizer performs stochastic gradient descent with optional momentum and weight decay.

Nesterov momentum is based on the formula from “On the importance of initialization and momentum in deep learning”.

  • params (Union[Iterable[Parameter], dict]) – Iterable of parameters to optimize or dicts defining parameter groups.

  • lr (float) – Learning rate.

  • momentum (float) – Momentum factor. Default: 0.0.

  • nesterov (bool) – Enables Nesterov momentum. Default: False.

  • weight_decay (float) – Weight decay (L2 penalty). Default: 0.0.
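To illustrate how `lr`, `momentum`, `nesterov`, and `weight_decay` interact, here is a minimal sketch of one update step for a single scalar parameter. It follows a common formulation of SGD with momentum and is an assumption for illustration, not necessarily the exact update this class performs; the function name `sgd_step` and the `velocity` argument are hypothetical.

```python
def sgd_step(param, grad, velocity, lr, momentum=0.0, nesterov=False, weight_decay=0.0):
    """One illustrative SGD update for a scalar parameter.

    Hypothetical helper, not part of the documented API.
    Returns the updated (param, velocity) pair.
    """
    # L2 penalty: fold weight_decay * param into the gradient.
    if weight_decay != 0.0:
        grad = grad + weight_decay * param

    if momentum != 0.0:
        # Accumulate an exponentially weighted sum of past gradients.
        velocity = momentum * velocity + grad
        if nesterov:
            # Nesterov variant: look ahead along the updated velocity.
            grad = grad + momentum * velocity
        else:
            grad = velocity

    # Step against the (possibly modified) gradient.
    param = param - lr * grad
    return param, velocity


# Usage: minimize f(x) = x**2, whose gradient is 2 * x.
x, v = 1.0, 0.0
for _ in range(50):
    x, v = sgd_step(x, 2.0 * x, v, lr=0.1, momentum=0.9, nesterov=True)
```

With `momentum=0.0` the step reduces to plain gradient descent, `param - lr * grad`, and the velocity is left untouched.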


Returns: An instance of the SGD optimizer.


