class Adadelta(params, lr=1.0, rho=0.9, eps=1e-6, weight_decay=0.0)

Implements the Adadelta algorithm.

It was proposed in “ADADELTA: An Adaptive Learning Rate Method”.

  • params (Union[Iterable[Parameter], dict]) – an iterable of parameters to optimize, or a dict defining parameter groups.

  • lr (float) – coefficient that scales delta before it is applied to the parameters. Default: 1.0

  • rho (float) – coefficient used for computing a running average of squared gradients. Default: 0.9

  • eps (float) – term added to the denominator to improve numerical stability. Default: 1e-6

  • weight_decay (float) – weight decay (L2 penalty). Default: 0
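
To show how these parameters interact, here is a minimal pure-Python sketch of one Adadelta update, following the rule in the paper cited above. It is an illustration only, not this class's actual implementation: the function name `adadelta_step` and the state lists `square_avg` and `acc_delta` are hypothetical, and weight decay is assumed to be added to the gradient as an L2 penalty.

```python
import math


def adadelta_step(params, grads, square_avg, acc_delta,
                  lr=1.0, rho=0.9, eps=1e-6, weight_decay=0.0):
    """Apply one Adadelta update in place to lists of floats (illustrative sketch)."""
    for i, (p, g) in enumerate(zip(params, grads)):
        # Assumption: weight decay acts as an L2 penalty added to the gradient.
        if weight_decay != 0.0:
            g = g + weight_decay * p
        # Running average of squared gradients, controlled by rho.
        square_avg[i] = rho * square_avg[i] + (1.0 - rho) * g * g
        # Step size: ratio of the two RMS terms; eps keeps both roots well defined.
        delta = math.sqrt(acc_delta[i] + eps) / math.sqrt(square_avg[i] + eps) * g
        # Running average of squared updates, also controlled by rho.
        acc_delta[i] = rho * acc_delta[i] + (1.0 - rho) * delta * delta
        # lr scales delta before it is applied to the parameter.
        params[i] = p - lr * delta
    return params


# Usage: a few steps on f(x) = x^2, whose gradient is 2x.
x = [1.0]
square_avg = [0.0]
acc_delta = [0.0]
for _ in range(10):
    adadelta_step(x, [2.0 * x[0]], square_avg, acc_delta)
```

Because both running averages start at zero, the first steps are on the order of sqrt(eps), which is why Adadelta typically moves slowly at the beginning of training.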