What Is GRU (Gated Recurrent Unit)

Neural Network Tutorials - Herong's Tutorial Examples


This section provides a quick introduction to GRU (Gated Recurrent Unit), which is a simplified version of the LSTM (Long Short-Term Memory) recurrent neural network model. GRU uses only one state vector and two gate vectors: the reset gate and the update gate.

What Is GRU (Gated Recurrent Unit)? GRU is a simplified version of the LSTM (Long Short-Term Memory) recurrent neural network model. It uses only one state vector and two gate vectors, the reset gate and the update gate, as described in this tutorial.

1. If we follow the same presentation style used for the LSTM model in the previous tutorial, we can present the GRU model as an information flow diagram, as shown below (on the right).

[Figure: LSTM Model vs. GRU Model]

2. As in the LSTM model, the reset gate vector and the update gate vector are calculated as follows.

Reset gate vector:
  r = sigmoid(Wgr_t · x_t + Ugr_t · s_t-1)

Update gate vector:
  u = sigmoid(Wgu_t · x_t + Ugu_t · s_t-1)
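The two gate formulas above can be sketched in plain Python. This is only an illustration: the helper names (sigmoid, matvec, gate), the 2-element vector sizes, and all weight values are assumptions made up for the example, and the "_t" suffix on the weight names is dropped for brevity.

```python
import math

def sigmoid(z):
    """Logistic function for one scalar; the result is always between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector (list of numbers)."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def gate(W, U, x_t, s_prev):
    """Compute sigmoid(W · x_t + U · s_prev) element by element."""
    return [sigmoid(a + b) for a, b in zip(matvec(W, x_t), matvec(U, s_prev))]

# Hypothetical input and previous state (s_prev stands for s_t-1).
x_t = [1.0, -1.0]
s_prev = [0.5, 0.0]

# Hypothetical weights for the reset gate.
Wgr = [[0.1, 0.2], [0.3, 0.4]]
Ugr = [[0.5, 0.0], [0.0, 0.5]]

# Hypothetical all-zero weights for the update gate, so u is 0.5 everywhere.
Wgu = [[0.0, 0.0], [0.0, 0.0]]
Ugu = [[0.0, 0.0], [0.0, 0.0]]

r = gate(Wgr, Ugr, x_t, s_prev)  # reset gate vector
u = gate(Wgu, Ugu, x_t, s_prev)  # update gate vector
print("r =", r)
print("u =", u)
```

Because both gates pass through sigmoid(), every element of r and u lies strictly between 0 and 1, which is what lets them act as soft switches on the values they multiply.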

3. Note that the reset gate vector, r, creates only one gate function, which controls the flow of s_t-1 into the input recursive function Ri(). The update gate vector, u, in contrast, creates a pair of gate functions that control the flow of s_t-1 and Ri() into the final outputs, s_t and y_t.
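To make the roles of the two gates concrete, here is a minimal sketch of one GRU time step in plain Python. It rests on several assumptions that are not stated in this section: Ri() is taken to be tanh(W · x_t + U · (r * s_t-1)), the pair of update gate functions is taken to be (1 - u) applied to s_t-1 and u applied to the output of Ri(), and y_t is taken to be equal to s_t. These follow a commonly used GRU formulation; some references swap the roles of u and (1 - u). The function name gru_step and the parameter dictionary keys are hypothetical.

```python
import math

def sigmoid(z):
    """Logistic function for one scalar."""
    return 1.0 / (1.0 + math.exp(-z))

def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector (list of numbers)."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def gru_step(x_t, s_prev, p):
    """One GRU time step. p maps hypothetical weight names to matrices."""
    r = [sigmoid(a + b)
         for a, b in zip(matvec(p["Wgr"], x_t), matvec(p["Ugr"], s_prev))]
    u = [sigmoid(a + b)
         for a, b in zip(matvec(p["Wgu"], x_t), matvec(p["Ugu"], s_prev))]

    # Single gate function from r: scale s_prev before it enters Ri().
    gated = [ri * si for ri, si in zip(r, s_prev)]

    # Assumed form of Ri(): tanh(W · x_t + U · (r * s_prev)).
    cand = [math.tanh(a + b)
            for a, b in zip(matvec(p["W"], x_t), matvec(p["U"], gated))]

    # Pair of gate functions from u (assumed convention):
    # (1 - u) scales s_prev, u scales the output of Ri().
    s_t = [(1.0 - ui) * si + ui * ci for ui, si, ci in zip(u, s_prev, cand)]

    y_t = s_t  # assumption: the output is the new state itself
    return s_t, y_t

# Usage with hypothetical all-zero 2x2 weights.
zeros = [[0.0, 0.0], [0.0, 0.0]]
params = {name: zeros for name in ("Wgr", "Ugr", "Wgu", "Ugu", "W", "U")}
s_t, y_t = gru_step([1.0, 2.0], [0.4, -0.2], params)
print("s_t =", s_t)
```

With all-zero weights, both gates evaluate to 0.5 and the assumed Ri() returns 0, so the new state is simply half of the previous state. This makes it easy to see that u blends the old state with the candidate rather than replacing it outright.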