RNNCell#

class RNNCell(input_size, hidden_size, bias=True, nonlinearity='tanh')[source]#

An Elman RNN cell with tanh or ReLU non-linearity.

\[h' = \tanh(W_{ih} x + b_{ih} + W_{hh} h + b_{hh})\]

If nonlinearity is 'relu', then ReLU is used in place of tanh.
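The update rule above can be sketched in plain NumPy. This is an illustrative reference implementation, not MegEngine's internal code; the names `W_ih`, `W_hh`, `b_ih`, and `b_hh` mirror the formula, and the weight shapes follow the parameter descriptions below:

```python
import numpy as np

def rnn_cell(x, h, W_ih, W_hh, b_ih, b_hh, nonlinearity="tanh"):
    """One Elman RNN step: h' = act(W_ih x + b_ih + W_hh h + b_hh)."""
    pre = x @ W_ih.T + b_ih + h @ W_hh.T + b_hh
    if nonlinearity == "tanh":
        return np.tanh(pre)
    if nonlinearity == "relu":
        return np.maximum(pre, 0)
    raise ValueError("nonlinearity must be 'tanh' or 'relu'")

rng = np.random.default_rng(0)
input_size, hidden_size, batch = 10, 20, 3
x = rng.standard_normal((batch, input_size)).astype(np.float32)
h = np.zeros((batch, hidden_size), dtype=np.float32)  # zero initial state
W_ih = rng.standard_normal((hidden_size, input_size)).astype(np.float32)
W_hh = rng.standard_normal((hidden_size, hidden_size)).astype(np.float32)
b_ih = np.zeros(hidden_size, dtype=np.float32)
b_hh = np.zeros(hidden_size, dtype=np.float32)

h_next = rnn_cell(x, h, W_ih, W_hh, b_ih, b_hh)
print(h_next.shape)  # (3, 20)
```

With `nonlinearity="tanh"` every entry of the result lies in (-1, 1), while `"relu"` leaves the pre-activation unbounded above and clips negatives to zero.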

Parameters:
  • input_size (int) – The number of expected features in the input x

  • hidden_size (int) – The number of features in the hidden state h

  • bias (bool) – If False, then the layer does not use bias weights b_ih and b_hh. Default: True

  • nonlinearity (str) – The non-linearity to use. Can be either 'tanh' or 'relu'. Default: 'tanh'

Inputs: input, hidden
  • input of shape (batch, input_size): tensor containing input features

  • hidden of shape (batch, hidden_size): tensor containing the initial hidden state for each element in the batch. Defaults to zero if not provided.

Outputs: h'
  • h' of shape (batch, hidden_size): tensor containing the next hidden state for each element in the batch

Shape:
  • Input1: \((N, H_{in})\) tensor containing input features where \(H_{in}\) = input_size

  • Input2: \((N, H_{out})\) tensor containing the initial hidden state for each element in the batch where \(H_{out}\) = hidden_size. Defaults to zero if not provided.

  • Output: \((N, H_{out})\) tensor containing the next hidden state for each element in the batch

Examples

import numpy as np
import megengine as mge
import megengine.module as M

m = M.RNNCell(10, 20)
inp = mge.tensor(np.random.randn(3, 10), dtype=np.float32)
hx = mge.tensor(np.random.randn(3, 20), dtype=np.float32)
out = m(inp, hx)
print(out.numpy().shape)

Output:

(3, 20)