## RNN ##

### Parameters ###

**Constructor parameters**

- input\_size – The number of expected features in the input x.
- hidden\_size – The number of features in the hidden state h.
- num\_layers – Number of recurrent layers, i.e. how many RNNs are stacked. E.g., setting num\_layers=2 stacks two RNNs, with the second RNN taking in the outputs of the first RNN and computing the final results. Default: 1
- nonlinearity – The non-linearity to use. Can be either 'tanh' or 'relu'. Default: 'tanh'
- bias – If False, then the layer does not use bias weights b\_ih and b\_hh. Default: True
- batch\_first – If True, then the input and output tensors are provided as (batch, seq, feature) instead of (seq, batch, feature). Default: False
- dropout – If non-zero, introduces a Dropout layer on the outputs of each RNN layer except the last layer, with dropout probability equal to dropout. Default: 0
- bidirectional – If True, becomes a bidirectional RNN. Default: False
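To make the roles of nonlinearity and the bias weights b\_ih and b\_hh more concrete, the following is a minimal sketch that recomputes a single-layer, unidirectional tanh RNN by hand, using h\_t = tanh(W\_ih x\_t + b\_ih + W\_hh h\_(t-1) + b\_hh), and compares the result with nn.RNN. The helper name manual\_rnn and all sizes are assumptions chosen for illustration; weight\_ih\_l0, weight\_hh\_l0, bias\_ih\_l0 and bias\_hh\_l0 are the layer-0 parameters that nn.RNN exposes.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

rnn = nn.RNN(input_size=10, hidden_size=20, num_layers=1, nonlinearity='tanh')
x = torch.randn(5, 3, 10)    # (seq_len, batch, input_size)
h0 = torch.zeros(1, 3, 20)   # (num_layers * num_directions, batch, hidden_size)


def manual_rnn(rnn, x, h0):
    """Illustrative helper: recompute a 1-layer, unidirectional tanh RNN step by step."""
    w_ih, w_hh = rnn.weight_ih_l0, rnn.weight_hh_l0   # (hidden, input), (hidden, hidden)
    b_ih, b_hh = rnn.bias_ih_l0, rnn.bias_hh_l0       # (hidden,), (hidden,)
    h = h0[0]                                         # (batch, hidden)
    outputs = []
    for x_t in x:                                     # iterate over time steps
        h = torch.tanh(x_t @ w_ih.T + b_ih + h @ w_hh.T + b_hh)
        outputs.append(h)
    return torch.stack(outputs), h.unsqueeze(0)


with torch.no_grad():
    ref_out, ref_hn = rnn(x, h0)
    man_out, man_hn = manual_rnn(rnn, x, h0)

# Compare the hand-rolled recurrence with the module, up to float tolerance.
matches = (torch.allclose(ref_out, man_out, atol=1e-5)
           and torch.allclose(ref_hn, man_hn, atol=1e-5))
print(matches)
```

With bias=False the two bias terms would simply drop out of the update, and with nonlinearity='relu' the tanh would be replaced by a ReLU.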

Note: the sequence length is dynamic. It is not a constructor parameter; it is determined by the input tensor passed in at call time.
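As a small illustration of this point, the sketch below feeds inputs of several sequence lengths to one and the same nn.RNN instance. The lengths 1, 5 and 12 and the batch size 3 are arbitrary values picked for the example.

```python
import torch
import torch.nn as nn

rnn = nn.RNN(input_size=10, hidden_size=20, num_layers=2)

shapes = []
for seq_len in (1, 5, 12):
    x = torch.randn(seq_len, 3, 10)   # (seq_len, batch, input_size)
    output, hn = rnn(x)               # h_0 omitted, so it defaults to zeros
    shapes.append((tuple(output.shape), tuple(hn.shape)))
    print(seq_len, tuple(output.shape), tuple(hn.shape))
```

Only the first dimension of output follows the sequence length; the shape of hn stays the same for every call.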

**Inputs: input, h\_0**

Shape of input
input of shape (seq\_len, batch, input\_size): tensor containing the features of the input sequence. The input can also be a packed variable length sequence. See torch.nn.utils.rnn.pack\_padded\_sequence() or torch.nn.utils.rnn.pack\_sequence() for details.
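The packed variable-length case can be sketched as follows, assuming a batch of three zero-padded sequences with true lengths 5, 3 and 2 (values invented for the example). It uses pack\_padded\_sequence() to build the packed input and pad\_packed\_sequence() to turn the packed output back into a padded tensor.

```python
import torch
import torch.nn as nn
from torch.nn.utils.rnn import pack_padded_sequence, pad_packed_sequence

rnn = nn.RNN(input_size=10, hidden_size=20)

# Three sequences with true lengths 5, 3 and 2, zero-padded to length 5.
lengths = torch.tensor([5, 3, 2])        # sorted in decreasing order
padded = torch.randn(5, 3, 10)           # (max_seq_len, batch, input_size)
for i, n in enumerate(lengths.tolist()):
    padded[n:, i] = 0.0                  # zero out the padding positions

packed = pack_padded_sequence(padded, lengths)   # enforce_sorted=True by default
with torch.no_grad():
    packed_out, hn = rnn(packed)                 # the output is packed as well
output, out_lengths = pad_packed_sequence(packed_out)

print(type(packed_out).__name__)
print(output.shape, out_lengths.tolist())
```

In this setup hn holds the state at each sequence's own last valid step (for example step index 2 for the sequence of length 3), rather than the state at the padded end.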

Shape of h\_0
h\_0 of shape (num\_layers \* num\_directions, batch, hidden\_size): tensor containing the initial hidden state for each element in the batch. Defaults to zero if not provided. If the RNN is bidirectional, num\_directions should be 2, else it should be 1. h\_0 supplies the initial state for every RNN layer, so the num\_layers used in its first dimension must match the num\_layers of the RNN.
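A short sketch of how the h\_0 shape can be derived from the module itself, and of the zero default: the batch size 3 and the other sizes are arbitrary choices for the example.

```python
import torch
import torch.nn as nn

rnn = nn.RNN(input_size=10, hidden_size=20, num_layers=2, bidirectional=True)
x = torch.randn(5, 3, 10)   # (seq_len, batch, input_size)

# First dimension of h_0 is num_layers * num_directions.
num_directions = 2 if rnn.bidirectional else 1
h0 = torch.zeros(rnn.num_layers * num_directions, 3, rnn.hidden_size)

with torch.no_grad():
    out_default, hn_default = rnn(x)     # h_0 omitted
    out_zeros, hn_zeros = rnn(x, h0)     # explicit all-zero h_0

# Omitting h_0 should give the same result as passing zeros explicitly.
same = (torch.allclose(out_default, out_zeros)
        and torch.allclose(hn_default, hn_zeros))
print(h0.shape, same)
```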

**Outputs: output, h\_n**

output of shape (seq\_len, batch, num\_directions \* hidden\_size): tensor containing the output features (h\_t) from the last layer of the RNN, for each t. If a torch.nn.utils.rnn.PackedSequence has been given as the input, the output will also be a packed sequence. For the unpacked case, the directions can be separated using output.view(seq\_len, batch, num\_directions, hidden\_size), with forward and backward being direction 0 and 1 respectively. Similarly, the directions can be separated in the packed case. In the usual unrolled diagram of an RNN, this is the output drawn leaving the top of the network.

h\_n of shape (num\_layers \* num\_directions, batch, hidden\_size): tensor containing the hidden state for t = seq\_len. Like output, the layers can be separated using h\_n.view(num\_layers, num\_directions, batch, hidden\_size). In the unrolled diagram, this is the output leaving the right-hand side of the network; a bidirectional RNN also has one leaving the left-hand side.
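The two view() calls above can be tried out with a minimal sketch like the one below, which also checks how output and h\_n relate in a bidirectional RNN. All sizes are illustrative assumptions.

```python
import torch
import torch.nn as nn

seq_len, batch, input_size, hidden_size, num_layers = 5, 3, 10, 20, 2
rnn = nn.RNN(input_size, hidden_size, num_layers, bidirectional=True)
x = torch.randn(seq_len, batch, input_size)

with torch.no_grad():
    output, hn = rnn(x)

# Split the concatenated directions of output and the stacked layers of h_n.
out_dirs = output.view(seq_len, batch, 2, hidden_size)
hn_layers = hn.view(num_layers, 2, batch, hidden_size)

fwd_last = out_dirs[-1, :, 0]   # forward direction finishes at the last time step
bwd_last = out_dirs[0, :, 1]    # backward direction finishes at the first time step

fwd_ok = torch.allclose(fwd_last, hn_layers[-1, 0])
bwd_ok = torch.allclose(bwd_last, hn_layers[-1, 1])
print(fwd_ok, bwd_ok)
```

For the forward direction the final state of the last layer appears at the last time step of output, while for the backward direction it appears at the first time step, since that direction reads the sequence in reverse.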

### Example ###

    import torch
    import torch.nn as nn

    rnn = nn.RNN(10, 20, 2)           # (input_size, hidden_size, num_layers)
    input = torch.randn(5, 3, 10)     # (seq_len, batch, input_size)
    h0 = torch.randn(2, 3, 20)        # (num_layers * num_directions, batch, hidden_size)
    output, hn = rnn(input, h0)
    print(output.size(), hn.size())   # torch.Size([5, 3, 20]) torch.Size([2, 3, 20])
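For comparison, here is a sketch of the same setup with batch\_first=True. Only the layout of the input and output tensors changes; the hidden state tensors keep their (num\_layers \* num\_directions, batch, hidden\_size) layout.

```python
import torch
import torch.nn as nn

rnn = nn.RNN(10, 20, 2, batch_first=True)
x = torch.randn(3, 5, 10)      # (batch, seq_len, input_size)
h0 = torch.randn(2, 3, 20)     # still (num_layers * num_directions, batch, hidden_size)
output, hn = rnn(x, h0)
print(output.size(), hn.size())
```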

## LSTM ##

### Example ###

    import torch
    import torch.nn as nn

    rnn = nn.LSTM(10, 20, 2)          # (input_size, hidden_size, num_layers)
    input = torch.randn(5, 3, 10)     # (seq_len, batch, input_size)
    h0 = torch.randn(2, 3, 20)        # (num_layers * num_directions, batch, hidden_size)
    c0 = torch.randn(2, 3, 20)        # (num_layers * num_directions, batch, hidden_size)
    # output: (seq_len, batch, num_directions * hidden_size)
    output, (hn, cn) = rnn(input, (h0, c0))
    print(output.size(), hn.size(), cn.size())
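As a complementary sketch, the LSTM can also be called without the initial state tuple, in which case both h\_0 and c\_0 default to zeros. For a unidirectional LSTM the last time step of output should then coincide with the final hidden state of the top layer. The variable names here are chosen for the example.

```python
import torch
import torch.nn as nn

lstm = nn.LSTM(10, 20, 2)       # (input_size, hidden_size, num_layers)
x = torch.randn(5, 3, 10)       # (seq_len, batch, input_size)

with torch.no_grad():
    output, (hn, cn) = lstm(x)  # (h_0, c_0) omitted, both default to zeros

# output holds h_t of the top layer for every t; hn[-1] is its final value.
last_matches = torch.allclose(output[-1], hn[-1])
print(output.size(), hn.size(), cn.size(), last_matches)
```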

## GRU ##

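A typical bidirectional GRU construction, where embed\_size, hidden\_size, n\_layers and dropout are values chosen by the caller:

    gru = nn.GRU(embed_size, hidden_size, n_layers, dropout=dropout, bidirectional=True)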
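One way such a GRU might be used is as a small sequence encoder on top of an embedding layer. The sketch below is only an illustration: vocab\_size, the concrete sizes and the name summary are assumptions that do not come from the text above.

```python
import torch
import torch.nn as nn

# All sizes below are illustrative assumptions.
vocab_size, embed_size, hidden_size, n_layers, dropout = 100, 8, 16, 2, 0.1

embedding = nn.Embedding(vocab_size, embed_size)
gru = nn.GRU(embed_size, hidden_size, n_layers, dropout=dropout, bidirectional=True)

tokens = torch.randint(0, vocab_size, (7, 4))   # (seq_len, batch) of token ids
embedded = embedding(tokens)                    # (seq_len, batch, embed_size)
output, hn = gru(embedded)                      # h_0 omitted, defaults to zeros

# Final states of the last layer: hn[-2] is forward, hn[-1] is backward.
summary = torch.cat([hn[-2], hn[-1]], dim=1)    # (batch, 2 * hidden_size)
print(output.shape, hn.shape, summary.shape)
```

Concatenating the two final states is one common way to get a fixed-size summary of a variable-length sequence; other choices, such as pooling over output, are equally possible.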

### Example ###

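    import torch
    import torch.nn as nn

    rnn = nn.GRU(2, 4, 2, bidirectional=True)   # (input_size, hidden_size, num_layers)
    input = torch.randn(2, 2, 2)                # (seq_len, batch, input_size)
    h0 = torch.randn(4, 2, 4)                   # (num_layers * num_directions, batch, hidden_size)
    output, hn = rnn(input, h0)
    print(output)
    print(hn)
    print(output.size(), hn.size())             # torch.Size([2, 2, 8]) torch.Size([4, 2, 4])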