Adagrad
- class Adagrad(params, lr=1e-2, lr_decay=0.0, eps=1e-10, weight_decay=0.0)
Implements the Adagrad algorithm.
It was proposed in “Adaptive Subgradient Methods for Online Learning and Stochastic Optimization”.
- Parameters:
  - params (Union[Iterable[Parameter], dict]) – iterable of parameters to optimize or dicts defining parameter groups.
  - lr (float) – coefficient that scales delta before it is applied to the parameters. Default: 1e-2
  - lr_decay (float) – learning rate decay. Default: 0
  - eps (float) – term added to the denominator to improve numerical stability. Default: 1e-10
  - weight_decay (float) – weight decay (L2 penalty). Default: 0
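
To illustrate how the parameters above typically interact, here is a minimal pure-Python sketch of one Adagrad update. It assumes the commonly used formulation (learning rate decayed as `lr / (1 + (step - 1) * lr_decay)`, weight decay added to the gradient, `eps` added after the square root); this page does not spell out the exact update, so this class may differ in detail. The function `adagrad_step` and the `state_sum` accumulator are hypothetical names used only for this example and are not part of the documented API.

```python
import math


def adagrad_step(params, grads, state_sum, step,
                 lr=1e-2, lr_decay=0.0, eps=1e-10, weight_decay=0.0):
    """Apply one Adagrad update in place to lists of floats (illustrative only)."""
    # Assumed decay schedule; `step` counts from 1.
    clr = lr / (1.0 + (step - 1) * lr_decay)
    for i, (p, g) in enumerate(zip(params, grads)):
        if weight_decay != 0.0:
            # L2 penalty: add weight_decay * parameter to the gradient.
            g = g + weight_decay * p
        # Accumulate squared gradients per parameter.
        state_sum[i] += g * g
        # Scale the step by the inverse root of the accumulated sum.
        params[i] = p - clr * g / (math.sqrt(state_sum[i]) + eps)
    return params


# Minimize f(p) = sum(p_i ** 2), whose gradient is 2 * p_i.
params = [1.0, -2.0]
state_sum = [0.0, 0.0]
for step in range(1, 4):
    grads = [2.0 * p for p in params]
    adagrad_step(params, grads, state_sum, step)
```

Because the squared gradients only accumulate, the effective per-parameter step size shrinks over time; in this sketch `eps` only guards against division by zero when the accumulator is still zero.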