megengine.functional.nn.cross_entropy

cross_entropy(pred, label, axis=1, with_logits=True, label_smooth=0, reduction='mean')

Computes the multi-class cross entropy loss (using logits by default).

When using label smoothing, the label distribution is as follows:

\[y^{LS}_{k}=y_{k}\left(1-\alpha\right)+\alpha/K\]

where \(y^{LS}\) and \(y\) are the new and original label distributions, respectively, \(k\) is the index of the label distribution, \(\alpha\) is label_smooth, and \(K\) is the number of classes.
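For example, a minimal sketch in plain NumPy (not part of the MegEngine API) of how a one-hot target over \(K = 2\) classes is smoothed with \(\alpha = 0.1\):

>>> import numpy as np
>>> y = np.array([0., 1.])            # one-hot target, K = 2 classes
>>> alpha = 0.1                       # label_smooth
>>> y * (1 - alpha) + alpha / len(y)  # smoothed label distribution
array([0.05, 0.95])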

Parameters:
  • pred (Tensor) – input tensor representing the predicted value.

  • label (Tensor) – input tensor representing the classification label.

  • axis (int) – an axis along which softmax will be applied. Default: 1

  • with_logits (bool) – whether to apply softmax first. Default: True

  • label_smooth (float) – a label smoothing parameter that re-distributes the target distribution. Default: 0

  • reduction (str) – the reduction to apply to the output: ‘none’ | ‘mean’ | ‘sum’. Default: ‘mean’

Return type:

Tensor

Returns:

loss value.

Examples

By default (with_logits is True), pred is assumed to contain logits and class probabilities are obtained by applying softmax. This fused form has better numerical stability than calling softmax and then cross_entropy sequentially.

>>> from megengine import Tensor
>>> import megengine.functional as F
>>> pred = Tensor([[0., 1.], [0.3, 0.7], [0.7, 0.3]])
>>> label = Tensor([1., 1., 1.])
>>> F.nn.cross_entropy(pred, label)  
Tensor(0.57976407, device=xpux:0)
>>> F.nn.cross_entropy(pred, label, reduction="none")
Tensor([0.3133 0.513  0.913 ], device=xpux:0)
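
As a sanity check (a sketch, not documented behavior), the fused result above should agree, up to floating-point error, with applying softmax yourself and then passing probabilities:

>>> probs = F.softmax(pred, axis=1)
>>> loss = F.nn.cross_entropy(probs, label, with_logits=False)  # ≈ 0.5798, matches the fused call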

If pred already contains probabilities, set with_logits to False:

>>> pred = Tensor([[0., 1.], [0.3, 0.7], [0.7, 0.3]])
>>> label = Tensor([1., 1., 1.])
>>> F.nn.cross_entropy(pred, label, with_logits=False)  
Tensor(0.5202159, device=xpux:0)
>>> F.nn.cross_entropy(pred, label, with_logits=False, reduction="none")
Tensor([0.     0.3567 1.204 ], device=xpux:0)
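
label_smooth can be combined with either mode. A usage sketch (outputs omitted, since the exact values follow from the smoothed targets in the formula above):

>>> pred = Tensor([[0., 1.], [0.3, 0.7], [0.7, 0.3]])
>>> label = Tensor([1., 1., 1.])
>>> loss = F.nn.cross_entropy(pred, label, label_smooth=0.1)  # targets become [0.05, 0.95]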