megengine.amp.GradScaler
- class GradScaler(init_scale=2.0 ** 4, growth_factor=2.0, backoff_factor=0.5, growth_interval=2000)
- A helper class that performs gradient scaling to prevent data overflow in autocast mode.
- Parameters
- init_scale (float) – Initial scale factor.
- growth_factor (float) – Factor by which the scale is multiplied in an actual update stage. If growth_factor is 0, scale_factor will not be updated.
- backoff_factor (float) – Factor by which the scale is multiplied when an overflowed grad is encountered.
- growth_interval (int) – The interval between two scale update stages.
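The interaction of the four parameters can be illustrated with a small plain-Python sketch. This is not MegEngine's implementation: the `ToyScaler` class, its `found_overflow` argument, and the `_clean_steps` counter are hypothetical names, and the policy shown (shrink on overflow, grow after `growth_interval` consecutive overflow-free updates) is the common dynamic loss-scaling scheme that the parameter descriptions suggest.

```python
class ToyScaler:
    """Illustrative dynamic scale-update policy (assumption, not MegEngine code)."""

    def __init__(self, init_scale=2.0 ** 4, growth_factor=2.0,
                 backoff_factor=0.5, growth_interval=2000):
        self.scale_factor = float(init_scale)
        self.growth_factor = growth_factor
        self.backoff_factor = backoff_factor
        self.growth_interval = growth_interval
        self._clean_steps = 0  # consecutive updates without overflow

    def update(self, found_overflow):
        if found_overflow:
            # Overflow seen: shrink the scale and restart the counter.
            self.scale_factor *= self.backoff_factor
            self._clean_steps = 0
            return
        self._clean_steps += 1
        if self._clean_steps >= self.growth_interval:
            # Enough clean updates in a row: grow the scale.
            self.scale_factor *= self.growth_factor
            self._clean_steps = 0


scaler = ToyScaler(growth_interval=3)
history = []
for overflow in [False, False, False, True, False]:
    scaler.update(overflow)
    history.append(scaler.scale_factor)
print(history)  # [16.0, 16.0, 32.0, 16.0, 16.0]
```

With `growth_interval=3`, the scale doubles after the third clean update and is halved again by the overflow that follows.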
 
- Examples

    import megengine
    import megengine.functional as F
    from megengine.amp import GradScaler, autocast
    from megengine.autodiff import GradManager

    # `model` is assumed to be an already constructed module.
    gm = GradManager()
    opt = ...
    scaler = GradScaler()

    gm.attach(model.parameters())

    @autocast()
    def train_step(image, label):
        with gm:
            logits = model(image)
            loss = F.nn.cross_entropy(logits, label)
            # Scales the loss grad, runs backward, then unscales the
            # parameter grads and updates the scale factor.
            scaler.backward(gm, loss)
        opt.step().clear_grad()
        return loss

  For more flexible usage, scaler.backward can be split into three lines:

    @autocast()
    def train_step(image, label):
        with gm:
            logits = model(image)
            loss = F.nn.cross_entropy(logits, label)
            # 1. Backward with the scale factor as the initial grad.
            gm.backward(loss, dy=megengine.tensor(scaler.scale_factor))
        # 2. Unscale the grads of all attached tensors.
        scaler.unscale(gm.attached_tensors())
        # 3. Update the scale factor.
        scaler.update()
        opt.step().clear_grad()
        return loss

  This is useful when grads need to be accumulated over multiple batches.

- Methods
- backward(gm[, y, dy, unscale_grad, update_scale]) – A wrapper of GradManager's backward, used to scale y's grad and unscale parameters' grads.
- load_state_dict(state)
- unscale(grad_tensors) – Unscale the grads of all tensors in grad_tensors.
- update([new_scale]) – Update the scale factor according to whether an overflowed grad was encountered.
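To make the gradient-accumulation use case more concrete, the following plain-Python sketch mimics the arithmetic of the split form: scaled grads from several batches are summed, unscaled once, and checked for non-finite values before an optimizer step would be taken. It does not call MegEngine; `accumulate_scaled`, `unscale`, and the sample gradient values are hypothetical stand-ins chosen for illustration.

```python
import math


def accumulate_scaled(per_batch_grads, scale_factor):
    """Sum per-batch grads, each multiplied by the scale factor.

    Stands in for repeated backward passes with dy=scale_factor.
    """
    total = [0.0] * len(per_batch_grads[0])
    for grads in per_batch_grads:
        for i, g in enumerate(grads):
            total[i] += g * scale_factor
    return total


def unscale(grads, scale_factor):
    """Divide by the scale once and report whether any grad is non-finite."""
    unscaled = [g / scale_factor for g in grads]
    overflow = any(not math.isfinite(g) for g in unscaled)
    return unscaled, overflow


scale = 16.0

# Three batches, two parameters each; all values are finite.
good = [[0.5, -1.0], [0.25, 2.0], [0.25, 1.0]]
grads, overflow = unscale(accumulate_scaled(good, scale), scale)
print(grads, overflow)  # [1.0, 2.0] False

# One batch produced an infinite grad, so the step should be skipped.
bad = [[0.5, -1.0], [float("inf"), 2.0]]
_, overflow_bad = unscale(accumulate_scaled(bad, scale), scale)
print(overflow_bad)  # True
```

Unscaling only once, after the last batch, keeps every accumulated term at the same scale, and a single non-finite value in any batch is still visible in the final check.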