RNNCell

class RNNCell(input_size, hidden_size, bias=True, nonlinearity='tanh')[source]

An Elman RNN cell with tanh or ReLU non-linearity.

$h' = \tanh(W_{ih} x + b_{ih} + W_{hh} h + b_{hh})$

If nonlinearity is 'relu', then ReLU is used in place of tanh.

Parameters
• input_size (int) – The number of expected features in the input x

• hidden_size (int) – The number of features in the hidden state h

• bias (bool) – If False, then the layer does not use bias weights b_ih and b_hh. Default: True

• nonlinearity (str) – The non-linearity to use. Can be either 'tanh' or 'relu'. Default: 'tanh'

Inputs: input, hidden
• input of shape (batch, input_size): tensor containing input features

• hidden of shape (batch, hidden_size): tensor containing the initial hidden state for each element in the batch. Defaults to zero if not provided.

Outputs: h'
• h' of shape (batch, hidden_size): tensor containing the next hidden state for each element in the batch

Shape:
• Input1: $(N, H_{in})$ tensor containing input features, where $H_{in}$ = input_size

• Input2: $(N, H_{out})$ tensor containing the initial hidden state for each element in the batch, where $H_{out}$ = hidden_size. Defaults to zero if not provided.

• Output: $(N, H_{out})$ tensor containing the next hidden state for each element in the batch
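The update rule above can be sketched in plain NumPy. This is a minimal illustration of the formula, not MegEngine's actual implementation; the weight and bias names ($W_{ih}$, $W_{hh}$, $b_{ih}$, $b_{hh}$) are taken from the equation, and the shapes follow the Shape section:

```python
import numpy as np

def elman_cell(x, h, W_ih, W_hh, b_ih, b_hh, nonlinearity="tanh"):
    """One Elman RNN step: h' = f(W_ih x + b_ih + W_hh h + b_hh)."""
    pre = x @ W_ih.T + b_ih + h @ W_hh.T + b_hh
    if nonlinearity == "tanh":
        return np.tanh(pre)
    return np.maximum(pre, 0.0)  # 'relu'

rng = np.random.default_rng(0)
batch, input_size, hidden_size = 3, 10, 20
W_ih = rng.standard_normal((hidden_size, input_size))
W_hh = rng.standard_normal((hidden_size, hidden_size))
b_ih = rng.standard_normal(hidden_size)
b_hh = rng.standard_normal(hidden_size)

x = rng.standard_normal((batch, input_size))   # (N, H_in)
h = np.zeros((batch, hidden_size))             # initial hidden state defaults to zero
h_next = elman_cell(x, h, W_ih, W_hh, b_ih, b_hh)
print(h_next.shape)  # (3, 20)
```

With the tanh non-linearity every entry of the output lies in (-1, 1); with 'relu' the output is non-negative and unbounded above.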

import numpy as np
import megengine as mge
import megengine.module as M

# Cell with input_size=10, hidden_size=20
m = M.RNNCell(10, 20)
inp = mge.tensor(np.random.randn(3, 10), dtype=np.float32)  # (batch, input_size)
hx = mge.tensor(np.random.randn(3, 20), dtype=np.float32)   # (batch, hidden_size)
out = m(inp, hx)  # next hidden state h'
print(out.numpy().shape)


(3, 20)