# LayerNorm

class LayerNorm(normalized_shape, eps=1e-05, affine=True, **kwargs)

Applies Layer Normalization over a mini-batch of inputs. Refer to Layer Normalization for details.

$y = \frac{x - \mathrm{E}[x]}{ \sqrt{\mathrm{Var}[x] + \epsilon}} * \gamma + \beta$

The mean and standard deviation are computed over the trailing dimensions of the input, which must have the shape specified by normalized_shape; every leading index gets its own statistics. $$\gamma$$ and $$\beta$$ are learnable affine transform parameters of shape normalized_shape if affine is True. The standard deviation is computed with the biased estimator, that is, the sum of squared deviations is divided by the number of elements rather than by that number minus one.
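The formula above can be sketched in plain NumPy. This is a minimal reference sketch under stated assumptions, not MegEngine's implementation: the function name `layer_norm_reference` and its `gamma` and `beta` arguments are hypothetical and exist only for illustration.

```python
import numpy as np


def layer_norm_reference(x, normalized_shape, eps=1e-5, gamma=None, beta=None):
    """Hypothetical NumPy reference for the formula above (not MegEngine code)."""
    # Treat a single integer as a one-element shape.
    if isinstance(normalized_shape, int):
        normalized_shape = (normalized_shape,)
    normalized_shape = tuple(normalized_shape)

    # Statistics are taken over the trailing len(normalized_shape) axes.
    axes = tuple(range(x.ndim - len(normalized_shape), x.ndim))
    mean = x.mean(axis=axes, keepdims=True)
    # np.var uses ddof=0 by default, which is the biased estimator.
    var = x.var(axis=axes, keepdims=True)

    y = (x - mean) / np.sqrt(var + eps)
    # Optional affine transform; gamma and beta broadcast over the leading axes.
    if gamma is not None:
        y = y * gamma
    if beta is not None:
        y = y + beta
    return y


x = np.arange(2 * 3 * 4 * 4, dtype=np.float32).reshape(2, 3, 4, 4)
y = layer_norm_reference(x, (4, 4))
print(y.shape)  # (2, 3, 4, 4)
```

Without the affine transform, each 4 x 4 slice of `y` has a mean close to 0 and a variance close to 1; the variance falls slightly short of 1 because of `eps` in the denominator.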

Parameters:
• normalized_shape (int or tuple) – shape of the trailing dimensions of an expected input of size $$[*, normalized\_shape[0], normalized\_shape[1], ..., normalized\_shape[-1]]$$. If it is a single integer, this module normalizes over the last dimension, which is expected to be of that size.

• eps – a value added to the denominator for numerical stability. Default: 1e-5

• affine – whether this module has learnable affine parameters (weight, bias). Default: True
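To make the role of normalized_shape concrete, the sketch below maps an input shape and a normalized_shape to the axes that would be normalized. The helper `resolve_normalized_axes` is hypothetical and not part of MegEngine, and raising `ValueError` on a mismatch is an assumption of this sketch rather than documented library behavior.

```python
def resolve_normalized_axes(input_shape, normalized_shape):
    """Hypothetical helper: return the input axes covered by normalized_shape."""
    # A single integer means "normalize over the last dimension only".
    if isinstance(normalized_shape, int):
        normalized_shape = (normalized_shape,)
    normalized_shape = tuple(normalized_shape)

    k = len(normalized_shape)
    # The trailing k dimensions of the input must equal normalized_shape.
    # (ValueError is this sketch's choice, not documented MegEngine behavior.)
    if k == 0 or k > len(input_shape) or tuple(input_shape[-k:]) != normalized_shape:
        raise ValueError(
            f"input shape {tuple(input_shape)} does not end with {normalized_shape}"
        )
    return tuple(range(len(input_shape) - k, len(input_shape)))


print(resolve_normalized_axes((2, 3, 4, 4), (4, 4)))  # (2, 3)
print(resolve_normalized_axes((2, 3, 4, 4), 4))       # (3,)
```

When affine is True, weight and bias would have the shape normalized_shape, so they broadcast against the axes returned here.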

Shape:
• Input: $$(N, *)$$ (2-D, 3-D, 4-D or 5-D tensor)

• Output: $$(N, *)$$ (same shape as input)

>>> import numpy as np
>>> import megengine.module as M
>>> from megengine import Tensor
>>> inp = Tensor(np.arange(2 * 3 * 4 * 4).astype(np.float32).reshape(2, 3, 4, 4))
>>> m = M.LayerNorm((4, 4))
>>> out = m(inp)
>>> out.numpy().shape
(2, 3, 4, 4)