From b197cdfeb7af360550b956bd1003375c9892815f Mon Sep 17 00:00:00 2001 From: "Documenter.jl" Date: Sat, 14 Dec 2024 20:39:12 +0000 Subject: [PATCH] build based on 9033eb3 --- previews/PR27/.documenter-siteinfo.json | 2 +- previews/PR27/api/cells/index.html | 31 +++++++++++++------------ previews/PR27/api/wrappers/index.html | 28 +++++++++++----------- previews/PR27/index.html | 2 +- previews/PR27/roadmap/index.html | 2 +- previews/PR27/search_index.js | 2 +- 6 files changed, 34 insertions(+), 33 deletions(-) diff --git a/previews/PR27/.documenter-siteinfo.json b/previews/PR27/.documenter-siteinfo.json index a175f85..72c527b 100644 --- a/previews/PR27/.documenter-siteinfo.json +++ b/previews/PR27/.documenter-siteinfo.json @@ -1 +1 @@ -{"documenter":{"julia_version":"1.11.2","generation_timestamp":"2024-12-14T20:24:46","documenter_version":"1.8.0"}} \ No newline at end of file +{"documenter":{"julia_version":"1.11.2","generation_timestamp":"2024-12-14T20:39:06","documenter_version":"1.8.0"}} \ No newline at end of file diff --git a/previews/PR27/api/cells/index.html b/previews/PR27/api/cells/index.html index b7bbbfe..e841d9d 100644 --- a/previews/PR27/api/cells/index.html +++ b/previews/PR27/api/cells/index.html @@ -17,31 +17,31 @@ #result with default initialization of internal states result = rancell(inp) #result with internal states provided -result_state = rancell(inp, (state, c_state))source
RecurrentLayers.IndRNNCellType
IndRNNCell((input_size => hidden_size)::Pair, σ=relu;
+result_state = rancell(inp, (state, c_state))
source
RecurrentLayers.IndRNNCellType
IndRNNCell((input_size => hidden_size)::Pair, σ=relu;
     init_kernel = glorot_uniform,
     init_recurrent_kernel = glorot_uniform,
-    bias = true)

Independently recurrent cell. See IndRNN for a layer that processes entire sequences.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer
  • σ: activation function. Default is relu
  • init_kernel: initializer for the input to hidden weights
  • init_recurrent_kernel: initializer for the hidden to hidden weights
  • bias: include a bias or not. Default is true

Equations

\[\mathbf{h}_{t} = \sigma(\mathbf{W} \mathbf{x}_t + \mathbf{u} \odot \mathbf{h}_{t-1} + \mathbf{b})\]

Forward

indrnncell(inp, [state])

The arguments of the forward pass are:

  • inp: The input to the indrnncell. It should be a vector of size input_size or a matrix of size input_size x batch_size.
  • state: The hidden state of the IndRNNCell. It should be a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros.

Returns a tuple (output, state), where both elements are given by the updated state new_state, a tensor of size hidden_size or hidden_size x batch_size.

source
RecurrentLayers.LightRUCellType
LightRUCell((input_size => hidden_size)::Pair;
+    bias = true)

Independently recurrent cell. See IndRNN for a layer that processes entire sequences.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer
  • σ: activation function. Default is relu
  • init_kernel: initializer for the input to hidden weights
  • init_recurrent_kernel: initializer for the hidden to hidden weights
  • bias: include a bias or not. Default is true

Equations

\[\mathbf{h}_{t} = \sigma(\mathbf{W} \mathbf{x}_t + \mathbf{u} \odot \mathbf{h}_{t-1} + \mathbf{b})\]

Forward

indrnncell(inp, [state])

The arguments of the forward pass are:

  • inp: The input to the indrnncell. It should be a vector of size input_size or a matrix of size input_size x batch_size.
  • state: The hidden state of the IndRNNCell. It should be a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros.

Returns a tuple (output, state), where both elements are given by the updated state new_state, a tensor of size hidden_size or hidden_size x batch_size.
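
Examples

A minimal usage sketch, in the style of the RANCell example above; the 3 => 5 dimensions and random inputs are arbitrary illustration values:

indrnncell = IndRNNCell(3 => 5)
inp = rand(Float32, 3)
#initializing the hidden state, if we want to provide it
state = rand(Float32, 5)

#result with default initialization of the internal state
result = indrnncell(inp)
#result with the internal state provided
result_state = indrnncell(inp, state)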

source
RecurrentLayers.LightRUCellType
LightRUCell((input_size => hidden_size)::Pair;
     init_kernel = glorot_uniform,
     init_recurrent_kernel = glorot_uniform,
     bias = true)

Light recurrent unit. See LightRU for a layer that processes entire sequences.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer
  • init_kernel: initializer for the input to hidden weights
  • init_recurrent_kernel: initializer for the hidden to hidden weights
  • bias: include a bias or not. Default is true

Equations

\[\begin{aligned} \tilde{h}_t &= \tanh(W_h x_t), \\ f_t &= \delta(W_f x_t + U_f h_{t-1} + b_f), \\ h_t &= (1 - f_t) \odot h_{t-1} + f_t \odot \tilde{h}_t. -\end{aligned}\]

Forward

lightrucell(inp, [state])

The arguments of the forward pass are:

  • inp: The input to the lightrucell. It should be a vector of size input_size or a matrix of size input_size x batch_size.
  • state: The hidden state of the LightRUCell. It should be a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros.

Returns a tuple (output, state), where both elements are given by the updated state new_state, a tensor of size hidden_size or hidden_size x batch_size.

source
RecurrentLayers.LiGRUCellType
LiGRUCell((input_size => hidden_size)::Pair;
+\end{aligned}\]

Forward

lightrucell(inp, [state])

The arguments of the forward pass are:

  • inp: The input to the lightrucell. It should be a vector of size input_size or a matrix of size input_size x batch_size.
  • state: The hidden state of the LightRUCell. It should be a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros.

Returns a tuple (output, state), where both elements are given by the updated state new_state, a tensor of size hidden_size or hidden_size x batch_size.
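
Examples

A minimal usage sketch; as above, the 3 => 5 dimensions are arbitrary:

lightrucell = LightRUCell(3 => 5)
inp = rand(Float32, 3)
#initializing the hidden state, if we want to provide it
state = rand(Float32, 5)

#result with default initialization of the internal state
result = lightrucell(inp)
#result with the internal state provided
result_state = lightrucell(inp, state)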

source
RecurrentLayers.LiGRUCellType
LiGRUCell((input_size => hidden_size)::Pair;
     init_kernel = glorot_uniform,
     init_recurrent_kernel = glorot_uniform,
     bias = true)

Light gated recurrent unit. The implementation does not include the batch normalization as described in the original paper. See LiGRU for a layer that processes entire sequences.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer
  • init_kernel: initializer for the input to hidden weights
  • init_recurrent_kernel: initializer for the hidden to hidden weights
  • bias: include a bias or not. Default is true

Equations

\[\begin{aligned} z_t &= \sigma(W_z x_t + U_z h_{t-1}), \\ \tilde{h}_t &= \text{ReLU}(W_h x_t + U_h h_{t-1}), \\ h_t &= z_t \odot h_{t-1} + (1 - z_t) \odot \tilde{h}_t -\end{aligned}\]

Forward

ligrucell(inp, [state])

The arguments of the forward pass are:

  • inp: The input to the ligrucell. It should be a vector of size input_size or a matrix of size input_size x batch_size.
  • state: The hidden state of the LiGRUCell. It should be a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros.

Returns a tuple (output, state), where both elements are given by the updated state new_state, a tensor of size hidden_size or hidden_size x batch_size.

source
RecurrentLayers.MGUCellType
MGUCell((input_size => hidden_size)::Pair;
+\end{aligned}\]

Forward

ligrucell(inp, [state])

The arguments of the forward pass are:

  • inp: The input to the ligrucell. It should be a vector of size input_size or a matrix of size input_size x batch_size.
  • state: The hidden state of the LiGRUCell. It should be a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros.

Returns a tuple (output, state), where both elements are given by the updated state new_state, a tensor of size hidden_size or hidden_size x batch_size.
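
Examples

A minimal usage sketch with arbitrary 3 => 5 dimensions:

ligrucell = LiGRUCell(3 => 5)
inp = rand(Float32, 3)
#initializing the hidden state, if we want to provide it
state = rand(Float32, 5)

#result with default initialization of the internal state
result = ligrucell(inp)
#result with the internal state provided
result_state = ligrucell(inp, state)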

source
RecurrentLayers.MGUCellType
MGUCell((input_size => hidden_size)::Pair;
     init_kernel = glorot_uniform,
     init_recurrent_kernel = glorot_uniform,
     bias = true)

Minimal gated unit. See MGU for a layer that processes entire sequences.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer
  • init_kernel: initializer for the input to hidden weights
  • init_recurrent_kernel: initializer for the hidden to hidden weights
  • bias: include a bias or not. Default is true

Equations

\[\begin{aligned} f_t &= \sigma(W_f x_t + U_f h_{t-1} + b_f), \\ \tilde{h}_t &= \tanh(W_h x_t + U_h (f_t \odot h_{t-1}) + b_h), \\ h_t &= (1 - f_t) \odot h_{t-1} + f_t \odot \tilde{h}_t -\end{aligned}\]

Forward

mgucell(inp, [state])

The arguments of the forward pass are:

  • inp: The input to the mgucell. It should be a vector of size input_size or a matrix of size input_size x batch_size.
  • state: The hidden state of the MGUCell. It should be a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros.

Returns a tuple (output, state), where both elements are given by the updated state new_state, a tensor of size hidden_size or hidden_size x batch_size.

source
RecurrentLayers.NASCellType
NASCell((input_size => hidden_size);
+\end{aligned}\]

Forward

mgucell(inp, [state])

The arguments of the forward pass are:

  • inp: The input to the mgucell. It should be a vector of size input_size or a matrix of size input_size x batch_size.
  • state: The hidden state of the MGUCell. It should be a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros.

Returns a tuple (output, state), where both elements are given by the updated state new_state, a tensor of size hidden_size or hidden_size x batch_size.
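
Examples

A minimal usage sketch with arbitrary 3 => 5 dimensions:

mgucell = MGUCell(3 => 5)
inp = rand(Float32, 3)
#initializing the hidden state, if we want to provide it
state = rand(Float32, 5)

#result with default initialization of the internal state
result = mgucell(inp)
#result with the internal state provided
result_state = mgucell(inp, state)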

source
RecurrentLayers.NASCellType
NASCell((input_size => hidden_size);
     init_kernel = glorot_uniform,
     init_recurrent_kernel = glorot_uniform,
     bias = true)

Neural Architecture Search unit. See NAS for a layer that processes entire sequences.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer
  • init_kernel: initializer for the input to hidden weights
  • init_recurrent_kernel: initializer for the hidden to hidden weights
  • bias: include a bias or not. Default is true

Equations

\[\begin{aligned} @@ -68,7 +68,8 @@ c_{\text{new}} &= l_1 \cdot l_2 \\ l_5 &= \tanh(l_3 + l_4) \\ h_{\text{new}} &= \tanh(c_{\text{new}} \cdot l_5) -\end{aligned}\]

Forward

rnncell(inp, [state])
source
RecurrentLayers.RHNCellType
RHNCell((input_size => hidden_size), depth=3;
+\end{aligned}\]

Forward

nascell(inp, (state, cstate))
+nascell(inp)

Arguments

  • inp: The input to the nascell. It should be a vector of size input_size or a matrix of size input_size x batch_size.
  • (state, cstate): A tuple containing the hidden and cell states of the NASCell. They should be vectors of size hidden_size or matrices of size hidden_size x batch_size. If not provided, they are assumed to be vectors of zeros.

Returns

  • A tuple (output, state), where output = new_state is the new hidden state and state = (new_state, new_cstate) is the new hidden and cell state. They are tensors of size hidden_size or hidden_size x batch_size.
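
Examples

A minimal usage sketch, following the RANCell example above; the 3 => 5 dimensions are arbitrary:

nascell = NASCell(3 => 5)
inp = rand(Float32, 3)
#initializing the hidden and cell states, if we want to provide them
state = rand(Float32, 5)
c_state = rand(Float32, 5)

#result with default initialization of internal states
result = nascell(inp)
#result with internal states provided
result_state = nascell(inp, (state, c_state))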
source
RecurrentLayers.RHNCellType
RHNCell((input_size => hidden_size), depth=3;
     couple_carry::Bool = true,
     cell_kwargs...)

Recurrent highway network. See RHNCellUnit for the unit component of this layer. See RHN for a layer that processes entire sequences.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer
  • depth: depth of the recurrence. Default is 3
  • couple_carry: couples the carry gate and the transform gate. Default true
  • init_kernel: initializer for the input to hidden weights
  • bias: include a bias or not. Default is true

Equations

\[\begin{aligned} s_{\ell}^{[t]} &= h_{\ell}^{[t]} \odot t_{\ell}^{[t]} + s_{\ell-1}^{[t]} \odot c_{\ell}^{[t]}, \\ @@ -76,9 +77,9 @@ h_{\ell}^{[t]} &= \tanh(W_h x^{[t]}\mathbb{I}_{\ell = 1} + U_{h_{\ell}} s_{\ell-1}^{[t]} + b_{h_{\ell}}), \\ t_{\ell}^{[t]} &= \sigma(W_t x^{[t]}\mathbb{I}_{\ell = 1} + U_{t_{\ell}} s_{\ell-1}^{[t]} + b_{t_{\ell}}), \\ c_{\ell}^{[t]} &= \sigma(W_c x^{[t]}\mathbb{I}_{\ell = 1} + U_{c_{\ell}} s_{\ell-1}^{[t]} + b_{c_{\ell}}) -\end{aligned}\]

Forward

rhncell(inp, [state])
source
RecurrentLayers.RHNCellUnitType
RHNCellUnit((input_size => hidden_size)::Pair;
+\end{aligned}\]

Forward

rhncell(inp, [state])
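
Examples

A minimal usage sketch, assuming the default depth of 3 and the single-state forward shown above; the 3 => 5 dimensions are arbitrary:

rhncell = RHNCell(3 => 5)
inp = rand(Float32, 3)
#initializing the state, if we want to provide it
state = rand(Float32, 5)

#result with default initialization of the internal state
result = rhncell(inp)
#result with the internal state provided
result_state = rhncell(inp, state)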
source
RecurrentLayers.RHNCellUnitType
RHNCellUnit((input_size => hidden_size)::Pair;
     init_kernel = glorot_uniform,
-    bias = true)
source
RecurrentLayers.MUT1CellType
MUT1Cell((input_size => hidden_size);
+    bias = true)
source
RecurrentLayers.MUT1CellType
MUT1Cell((input_size => hidden_size);
     init_kernel = glorot_uniform,
     init_recurrent_kernel = glorot_uniform,
     bias = true)

Mutated unit 1 cell. See MUT1 for a layer that processes entire sequences.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer
  • init_kernel: initializer for the input to hidden weights
  • init_recurrent_kernel: initializer for the hidden to hidden weights
  • bias: include a bias or not. Default is true

Equations

\[\begin{aligned} @@ -86,7 +87,7 @@ r &= \sigma(W_r x_t + U_r h_t + b_r), \\ h_{t+1} &= \tanh(U_h (r \odot h_t) + \tanh(W_h x_t) + b_h) \odot z \\ &\quad + h_t \odot (1 - z). -\end{aligned}\]

Forward

mutcell(inp, [state])

The arguments of the forward pass are:

  • inp: The input to the mutcell. It should be a vector of size input_size or a matrix of size input_size x batch_size.
  • state: The hidden state of the MUTCell. It should be a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros.

Returns a tuple (output, state), where both elements are given by the updated state new_state, a tensor of size hidden_size or hidden_size x batch_size.

source
RecurrentLayers.MUT2CellType
MUT2Cell((input_size => hidden_size);
+\end{aligned}\]

Forward

mutcell(inp, [state])

The arguments of the forward pass are:

  • inp: The input to the mutcell. It should be a vector of size input_size or a matrix of size input_size x batch_size.
  • state: The hidden state of the MUTCell. It should be a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros.

Returns a tuple (output, state), where both elements are given by the updated state new_state, a tensor of size hidden_size or hidden_size x batch_size.
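
Examples

A minimal usage sketch with arbitrary 3 => 5 dimensions:

mutcell = MUT1Cell(3 => 5)
inp = rand(Float32, 3)
#initializing the hidden state, if we want to provide it
state = rand(Float32, 5)

#result with default initialization of the internal state
result = mutcell(inp)
#result with the internal state provided
result_state = mutcell(inp, state)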

source
RecurrentLayers.MUT2CellType
MUT2Cell((input_size => hidden_size);
     init_kernel = glorot_uniform,
     init_recurrent_kernel = glorot_uniform,
     bias = true)

Mutated unit 2 cell. See MUT2 for a layer that processes entire sequences.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer
  • init_kernel: initializer for the input to hidden weights
  • init_recurrent_kernel: initializer for the hidden to hidden weights
  • bias: include a bias or not. Default is true

Equations

\[\begin{aligned} @@ -94,7 +95,7 @@ r &= \sigma(x_t + U_r h_t + b_r), \\ h_{t+1} &= \tanh(U_h (r \odot h_t) + W_h x_t + b_h) \odot z \\ &\quad + h_t \odot (1 - z). -\end{aligned}\]

Forward

mutcell(inp, [state])

The arguments of the forward pass are:

  • inp: The input to the mutcell. It should be a vector of size input_size or a matrix of size input_size x batch_size.
  • state: The hidden state of the MUTCell. It should be a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros.

Returns a tuple (output, state), where both elements are given by the updated state new_state, a tensor of size hidden_size or hidden_size x batch_size.

source
RecurrentLayers.MUT3CellType
MUT3Cell((input_size => hidden_size);
+\end{aligned}\]

Forward

mutcell(inp, [state])

The arguments of the forward pass are:

  • inp: The input to the mutcell. It should be a vector of size input_size or a matrix of size input_size x batch_size.
  • state: The hidden state of the MUTCell. It should be a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros.

Returns a tuple (output, state), where both elements are given by the updated state new_state, a tensor of size hidden_size or hidden_size x batch_size.
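
Examples

A minimal usage sketch with arbitrary 3 => 5 dimensions:

mutcell = MUT2Cell(3 => 5)
inp = rand(Float32, 3)
#initializing the hidden state, if we want to provide it
state = rand(Float32, 5)

#result with default initialization of the internal state
result = mutcell(inp)
#result with the internal state provided
result_state = mutcell(inp, state)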

source
RecurrentLayers.MUT3CellType
MUT3Cell((input_size => hidden_size);
     init_kernel = glorot_uniform,
     init_recurrent_kernel = glorot_uniform,
     bias = true)

Mutated unit 3 cell. See MUT3 for a layer that processes entire sequences.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer
  • init_kernel: initializer for the input to hidden weights
  • init_recurrent_kernel: initializer for the hidden to hidden weights
  • bias: include a bias or not. Default is true

Equations

\[\begin{aligned} @@ -102,7 +103,7 @@ r &= \sigma(W_r x_t + U_r h_t + b_r), \\ h_{t+1} &= \tanh(U_h (r \odot h_t) + W_h x_t + b_h) \odot z \\ &\quad + h_t \odot (1 - z). -\end{aligned}\]

Forward

mutcell(inp, [state])

The arguments of the forward pass are:

  • inp: The input to the mutcell. It should be a vector of size input_size or a matrix of size input_size x batch_size.
  • state: The hidden state of the MUTCell. It should be a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros.

Returns a tuple (output, state), where both elements are given by the updated state new_state, a tensor of size hidden_size or hidden_size x batch_size.

source
RecurrentLayers.SCRNCellType
SCRNCell((input_size => hidden_size)::Pair;
+\end{aligned}\]

Forward

mutcell(inp, [state])

The arguments of the forward pass are:

  • inp: The input to the mutcell. It should be a vector of size input_size or a matrix of size input_size x batch_size.
  • state: The hidden state of the MUTCell. It should be a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros.

Returns a tuple (output, state), where both elements are given by the updated state new_state, a tensor of size hidden_size or hidden_size x batch_size.
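
Examples

A minimal usage sketch with arbitrary 3 => 5 dimensions:

mutcell = MUT3Cell(3 => 5)
inp = rand(Float32, 3)
#initializing the hidden state, if we want to provide it
state = rand(Float32, 5)

#result with default initialization of the internal state
result = mutcell(inp)
#result with the internal state provided
result_state = mutcell(inp, state)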

source
RecurrentLayers.SCRNCellType
SCRNCell((input_size => hidden_size)::Pair;
     init_kernel = glorot_uniform,
     init_recurrent_kernel = glorot_uniform,
     bias = true,
@@ -110,7 +111,7 @@
 s_t &= (1 - \alpha) W_s x_t + \alpha s_{t-1}, \\
 h_t &= \sigma(W_h s_t + U_h h_{t-1} + b_h), \\
 y_t &= f(U_y h_t + W_y s_t)
-\end{aligned}\]

Forward

scrncell(inp, [state, c_state])
source
RecurrentLayers.PeepholeLSTMCellType
PeepholeLSTMCell((input_size => hidden_size)::Pair;
+\end{aligned}\]

Forward

scrncell(inp, [state, c_state])
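
Examples

A minimal usage sketch, assuming the (state, c_state) tuple form used by the RANCell example above; the 3 => 5 dimensions are arbitrary:

scrncell = SCRNCell(3 => 5)
inp = rand(Float32, 3)
#initializing the hidden and context states, if we want to provide them
state = rand(Float32, 5)
c_state = rand(Float32, 5)

#result with default initialization of internal states
result = scrncell(inp)
#result with internal states provided
result_state = scrncell(inp, (state, c_state))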
source
RecurrentLayers.PeepholeLSTMCellType
PeepholeLSTMCell((input_size => hidden_size)::Pair;
     init_kernel = glorot_uniform,
     init_recurrent_kernel = glorot_uniform,
     bias = true)

Peephole long short-term memory cell. See PeepholeLSTM for a layer that processes entire sequences.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer
  • init_kernel: initializer for the input to hidden weights
  • init_recurrent_kernel: initializer for the hidden to hidden weights
  • bias: include a bias or not. Default is true

Equations

\[\begin{aligned} @@ -119,17 +120,17 @@ o_t &= \sigma_g(W_o x_t + U_o c_{t-1} + b_o), \\ c_t &= f_t \odot c_{t-1} + i_t \odot \sigma_c(W_c x_t + b_c), \\ h_t &= o_t \odot \sigma_h(c_t). -\end{aligned}\]

Forward

lstmcell(x, [h, c])

The forward pass takes the following arguments:

  • x: Input to the cell, which can be a vector of size in or a matrix of size in x batch_size.
  • h: The hidden state vector of the cell, sized out, or a matrix of size out x batch_size.
  • c: The cell (memory) state, sized out, or a matrix of size out x batch_size.

If not provided, both h and c default to vectors of zeros.

Examples

source
RecurrentLayers.FastRNNCellType
FastRNNCell((input_size => hidden_size), [activation];
+\end{aligned}\]

Forward

lstmcell(x, [h, c])

The forward pass takes the following arguments:

  • x: Input to the cell, which can be a vector of size in or a matrix of size in x batch_size.
  • h: The hidden state vector of the cell, sized out, or a matrix of size out x batch_size.
  • c: The cell (memory) state, sized out, or a matrix of size out x batch_size.

If not provided, both h and c default to vectors of zeros.

Examples
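
A minimal usage sketch, assuming the (h, c) tuple form used by the RANCell example above; the 3 => 5 dimensions are arbitrary:

lstmcell = PeepholeLSTMCell(3 => 5)
x = rand(Float32, 3)
#initializing the hidden and cell states, if we want to provide them
h = rand(Float32, 5)
c = rand(Float32, 5)

#result with default initialization of internal states
result = lstmcell(x)
#result with internal states provided
result_state = lstmcell(x, (h, c))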

source
RecurrentLayers.FastRNNCellType
FastRNNCell((input_size => hidden_size), [activation];
     init_kernel = glorot_uniform,
     init_recurrent_kernel = glorot_uniform,
     bias = true)

Fast recurrent neural network cell. See FastRNN for a layer that processes entire sequences.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer
  • activation: the activation function, defaults to tanh_fast
  • init_kernel: initializer for the input to hidden weights
  • init_recurrent_kernel: initializer for the hidden to hidden weights
  • bias: include a bias or not. Default is true

Equations

\[\begin{aligned} \tilde{h}_t &= \sigma(W_h x_t + U_h h_{t-1} + b), \\ h_t &= \alpha \tilde{h}_t + \beta h_{t-1} -\end{aligned}\]

Forward

fastrnncell(inp, [state])

The arguments of the forward pass are:

  • inp: The input to the fastrnncell. It should be a vector of size input_size or a matrix of size input_size x batch_size.
  • state: The hidden state of the FastRNN. It should be a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros.

Returns a tuple (output, state), where both elements are given by the updated state new_state, a tensor of size hidden_size or hidden_size x batch_size.

source
RecurrentLayers.FastGRNNCellType
FastGRNNCell((input_size => hidden_size), [activation];
+\end{aligned}\]

Forward

fastrnncell(inp, [state])

The arguments of the forward pass are:

  • inp: The input to the fastrnncell. It should be a vector of size input_size or a matrix of size input_size x batch_size.
  • state: The hidden state of the FastRNN. It should be a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros.

Returns a tuple (output, state), where both elements are given by the updated state new_state, a tensor of size hidden_size or hidden_size x batch_size.
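
Examples

A minimal usage sketch with the default activation and arbitrary 3 => 5 dimensions:

fastrnncell = FastRNNCell(3 => 5)
inp = rand(Float32, 3)
#initializing the hidden state, if we want to provide it
state = rand(Float32, 5)

#result with default initialization of the internal state
result = fastrnncell(inp)
#result with the internal state provided
result_state = fastrnncell(inp, state)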

source
RecurrentLayers.FastGRNNCellType
FastGRNNCell((input_size => hidden_size), [activation];
     init_kernel = glorot_uniform,
     init_recurrent_kernel = glorot_uniform,
     bias = true)

Fast gated recurrent neural network cell. See FastGRNN for a layer that processes entire sequences.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer
  • activation: the activation function, defaults to tanh_fast
  • init_kernel: initializer for the input to hidden weights
  • init_recurrent_kernel: initializer for the hidden to hidden weights
  • bias: include a bias or not. Default is true

Equations

\[\begin{aligned} z_t &= \sigma(W_z x_t + U_z h_{t-1} + b_z), \\ \tilde{h}_t &= \tanh(W_h x_t + U_h h_{t-1} + b_h), \\ h_t &= \big((\zeta (1 - z_t) + \nu) \odot \tilde{h}_t\big) + z_t \odot h_{t-1} -\end{aligned}\]

Forward

fastgrnncell(inp, [state])

The arguments of the forward pass are:

  • inp: The input to the fastgrnncell. It should be a vector of size input_size or a matrix of size input_size x batch_size.
  • state: The hidden state of the FastGRNN. It should be a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros.

Returns a tuple (output, state), where both elements are given by the updated state new_state, a tensor of size hidden_size or hidden_size x batch_size.

source
+\end{aligned}\]

Forward

fastgrnncell(inp, [state])

The arguments of the forward pass are:

  • inp: The input to the fastgrnncell. It should be a vector of size input_size or a matrix of size input_size x batch_size.
  • state: The hidden state of the FastGRNN. It should be a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros.

Returns a tuple (output, state), where both elements are given by the updated state new_state, a tensor of size hidden_size or hidden_size x batch_size.
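
Examples

A minimal usage sketch with the default activation and arbitrary 3 => 5 dimensions:

fastgrnncell = FastGRNNCell(3 => 5)
inp = rand(Float32, 3)
#initializing the hidden state, if we want to provide it
state = rand(Float32, 5)

#result with default initialization of the internal state
result = fastgrnncell(inp)
#result with the internal state provided
result_state = fastgrnncell(inp, state)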

source diff --git a/previews/PR27/api/wrappers/index.html b/previews/PR27/api/wrappers/index.html index fe6edca..f0d0d21 100644 --- a/previews/PR27/api/wrappers/index.html +++ b/previews/PR27/api/wrappers/index.html @@ -5,20 +5,20 @@ f_t &= \sigma(W_f x_t + U_f h_{t-1} + b_f), \\ c_t &= i_t \odot \tilde{c}_t + f_t \odot c_{t-1}, \\ h_t &= g(c_t) -\end{aligned}\]

source
RecurrentLayers.IndRNNType
IndRNN((input_size => hidden_size)::Pair, σ=relu;
-    kwargs...)

Independently recurrent network. See IndRNNCell for a layer that processes a single sequence.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer
  • σ: activation function. Default is relu
  • init_kernel: initializer for the input to hidden weights
  • init_recurrent_kernel: initializer for the hidden to hidden weights
  • bias: include a bias or not. Default is true

Equations

\[\mathbf{h}_{t} = \sigma(\mathbf{W} \mathbf{x}_t + \mathbf{u} \odot \mathbf{h}_{t-1} + \mathbf{b})\]

Forward

indrnn(inp, [state])

The arguments of the forward pass are:

  • inp: The input to the indrnn. It should be a vector of size input_size x len or a matrix of size input_size x len x batch_size.
  • state: The hidden state of the IndRNN. If given, it is a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros.

Returns new hidden states new_states as an array of size hidden_size x len x batch_size.

source
RecurrentLayers.LightRUType
LightRU((input_size => hidden_size)::Pair; kwargs...)

Light recurrent unit network. See LightRUCell for a layer that processes a single sequence.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer
  • init_kernel: initializer for the input to hidden weights
  • init_recurrent_kernel: initializer for the hidden to hidden weights
  • bias: include a bias or not. Default is true

Equations

\[\begin{aligned} +\end{aligned}\]

source
RecurrentLayers.IndRNNType
IndRNN((input_size => hidden_size)::Pair, σ=relu;
+    kwargs...)

Independently recurrent network. See IndRNNCell for a layer that processes a single sequence.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer
  • σ: activation function. Default is relu
  • init_kernel: initializer for the input to hidden weights
  • init_recurrent_kernel: initializer for the hidden to hidden weights
  • bias: include a bias or not. Default is true

Equations

\[\mathbf{h}_{t} = \sigma(\mathbf{W} \mathbf{x}_t + \mathbf{u} \odot \mathbf{h}_{t-1} + \mathbf{b})\]

Forward

indrnn(inp, [state])

The arguments of the forward pass are:

  • inp: The input to the indrnn. It should be a matrix of size input_size x len or an array of size input_size x len x batch_size.
  • state: The hidden state of the IndRNN. If given, it is a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros.

Returns new hidden states new_states as an array of size hidden_size x len x batch_size.
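
Examples

A minimal usage sketch; the 3 => 5 size and the sequence length of 10 are arbitrary illustration values:

indrnn = IndRNN(3 => 5)
#a sequence of 10 steps, each with 3 features
inp = rand(Float32, 3, 10)
#hidden states for the full sequence, of size 5 x 10
output = indrnn(inp)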

source
RecurrentLayers.LightRUType
LightRU((input_size => hidden_size)::Pair; kwargs...)

Light recurrent unit network. See LightRUCell for a layer that processes a single sequence.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer
  • init_kernel: initializer for the input to hidden weights
  • init_recurrent_kernel: initializer for the hidden to hidden weights
  • bias: include a bias or not. Default is true

Equations

\[\begin{aligned} \tilde{h}_t &= \tanh(W_h x_t), \\ f_t &= \delta(W_f x_t + U_f h_{t-1} + b_f), \\ h_t &= (1 - f_t) \odot h_{t-1} + f_t \odot \tilde{h}_t. -\end{aligned}\]

Forward

lightru(inp, [state])

The arguments of the forward pass are:

  • inp: The input to the lightru. It should be a vector of size input_size x len or a matrix of size input_size x len x batch_size.
  • state: The hidden state of the LightRU. If given, it is a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros.

Returns new hidden states new_states as an array of size hidden_size x len x batch_size.

source
RecurrentLayers.LiGRUType
LiGRU((input_size => hidden_size)::Pair; kwargs...)

Light gated recurrent network. The implementation does not include the batch normalization as described in the original paper. See LiGRUCell for a layer that processes a single sequence.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer
  • init_kernel: initializer for the input to hidden weights
  • init_recurrent_kernel: initializer for the hidden to hidden weights
  • bias: include a bias or not. Default is true

Equations

\[\begin{aligned} +\end{aligned}\]

Forward

lightru(inp, [state])

The arguments of the forward pass are:

  • inp: The input to the lightru. It should be a matrix of size input_size x len or an array of size input_size x len x batch_size.
  • state: The hidden state of the LightRU. If given, it is a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros.

Returns new hidden states new_states as an array of size hidden_size x len x batch_size.
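
Examples

A minimal usage sketch with arbitrary sizes:

lightru = LightRU(3 => 5)
#a sequence of 10 steps, each with 3 features
inp = rand(Float32, 3, 10)
#hidden states for the full sequence, of size 5 x 10
output = lightru(inp)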

source
RecurrentLayers.LiGRUType
LiGRU((input_size => hidden_size)::Pair; kwargs...)

Light gated recurrent network. The implementation does not include the batch normalization as described in the original paper. See LiGRUCell for a layer that processes a single sequence.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer
  • init_kernel: initializer for the input to hidden weights
  • init_recurrent_kernel: initializer for the hidden to hidden weights
  • bias: include a bias or not. Default is true

Equations

\[\begin{aligned} z_t &= \sigma(W_z x_t + U_z h_{t-1}), \\ \tilde{h}_t &= \text{ReLU}(W_h x_t + U_h h_{t-1}), \\ h_t &= z_t \odot h_{t-1} + (1 - z_t) \odot \tilde{h}_t -\end{aligned}\]

Forward

ligru(inp, [state])

The arguments of the forward pass are:

  • inp: The input to the ligru. It should be a vector of size input_size x len or a matrix of size input_size x len x batch_size.
  • state: The hidden state of the LiGRU. If given, it is a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros.

Returns new hidden states new_states as an array of size hidden_size x len x batch_size.

source
RecurrentLayers.MGUType
MGU((input_size => hidden_size)::Pair; kwargs...)

Minimal gated unit network. See MGUCell for a layer that processes a single sequence.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer
  • init_kernel: initializer for the input to hidden weights
  • init_recurrent_kernel: initializer for the hidden to hidden weights
  • bias: include a bias or not. Default is true

Equations

\[\begin{aligned} +\end{aligned}\]

Forward

ligru(inp, [state])

The arguments of the forward pass are:

  • inp: The input to the ligru. It should be a matrix of size input_size x len or an array of size input_size x len x batch_size.
  • state: The hidden state of the LiGRU. If given, it is a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros.

Returns new hidden states new_states as an array of size hidden_size x len x batch_size.
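
Examples

A minimal usage sketch with arbitrary sizes:

ligru = LiGRU(3 => 5)
#a sequence of 10 steps, each with 3 features
inp = rand(Float32, 3, 10)
#hidden states for the full sequence, of size 5 x 10
output = ligru(inp)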

source
RecurrentLayers.MGUType
MGU((input_size => hidden_size)::Pair; kwargs...)

Minimal gated unit network. See MGUCell for a layer that processes a single sequence.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer
  • init_kernel: initializer for the input to hidden weights
  • init_recurrent_kernel: initializer for the hidden to hidden weights
  • bias: include a bias or not. Default is true

Equations

\[\begin{aligned} f_t &= \sigma(W_f x_t + U_f h_{t-1} + b_f), \\ \tilde{h}_t &= \tanh(W_h x_t + U_h (f_t \odot h_{t-1}) + b_h), \\ h_t &= (1 - f_t) \odot h_{t-1} + f_t \odot \tilde{h}_t -\end{aligned}\]

Forward

mgu(inp, [state])

The arguments of the forward pass are:

  • inp: The input to the mgu. It should be a vector of size input_size x len or a matrix of size input_size x len x batch_size.
  • state: The hidden state of the MGU. If given, it is a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros.

Returns new hidden states new_states as an array of size hidden_size x len x batch_size.

source
RecurrentLayers.NASType
NAS((input_size => hidden_size)::Pair; kwargs...)

Neural Architecture Search unit. See NASCell for a layer that processes a single sequence.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer
  • init_kernel: initializer for the input to hidden weights
  • init_recurrent_kernel: initializer for the hidden to hidden weights
  • bias: include a bias or not. Default is true

Equations

\[\begin{aligned} +\end{aligned}\]

Forward

mgu(inp, [state])

The arguments of the forward pass are:

  • inp: The input to the mgu. It should be a matrix of size input_size x len or an array of size input_size x len x batch_size.
  • state: The hidden state of the MGU. If given, it is a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros.

Returns new hidden states new_states as an array of size hidden_size x len x batch_size.
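
Examples

A minimal usage sketch with arbitrary sizes:

mgu = MGU(3 => 5)
#a sequence of 10 steps, each with 3 features
inp = rand(Float32, 3, 10)
#hidden states for the full sequence, of size 5 x 10
output = mgu(inp)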

source
RecurrentLayers.NASType
NAS((input_size => hidden_size)::Pair; kwargs...)

Neural Architecture Search unit. See NASCell for a layer that processes a single sequence.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer
  • init_kernel: initializer for the input to hidden weights
  • init_recurrent_kernel: initializer for the hidden to hidden weights
  • bias: include a bias or not. Default is true

Equations

\[\begin{aligned} \text{First Layer Outputs:} & \\ o_1 &= \sigma(W_i^{(1)} x_t + W_h^{(1)} h_{t-1} + b^{(1)}), \\ o_2 &= \text{ReLU}(W_i^{(2)} x_t + W_h^{(2)} h_{t-1} + b^{(2)}), \\ @@ -42,28 +42,28 @@ c_{\text{new}} &= l_1 \cdot l_2 \\ l_5 &= \tanh(l_3 + l_4) \\ h_{\text{new}} &= \tanh(c_{\text{new}} \cdot l_5) -\end{aligned}\]

source
RecurrentLayers.RHNType
RHN((input_size => hidden_size)::Pair, depth=3; kwargs...)

Recurrent highway network. See RHNCellUnit for the unit component of this layer. See RHNCell for a layer that processes a single sequence.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer
  • depth: depth of the recurrence. Default is 3
  • couple_carry: couples the carry gate and the transform gate. Default true
  • init_kernel: initializer for the input to hidden weights
  • bias: include a bias or not. Default is true

Equations

\[\begin{aligned} +\end{aligned}\]

source
RecurrentLayers.RHNType
RHN((input_size => hidden_size)::Pair, depth=3; kwargs...)

Recurrent highway network. See RHNCellUnit for the unit component of this layer. See RHNCell for a layer that processes a single sequence.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer
  • depth: depth of the recurrence. Default is 3
  • couple_carry: couples the carry gate and the transform gate. Default true
  • init_kernel: initializer for the input to hidden weights
  • bias: include a bias or not. Default is true

Equations

\[\begin{aligned} s_{\ell}^{[t]} &= h_{\ell}^{[t]} \odot t_{\ell}^{[t]} + s_{\ell-1}^{[t]} \odot c_{\ell}^{[t]}, \\ \text{where} \\ h_{\ell}^{[t]} &= \tanh(W_h x^{[t]}\mathbb{I}_{\ell = 1} + U_{h_{\ell}} s_{\ell-1}^{[t]} + b_{h_{\ell}}), \\ t_{\ell}^{[t]} &= \sigma(W_t x^{[t]}\mathbb{I}_{\ell = 1} + U_{t_{\ell}} s_{\ell-1}^{[t]} + b_{t_{\ell}}), \\ c_{\ell}^{[t]} &= \sigma(W_c x^{[t]}\mathbb{I}_{\ell = 1} + U_{c_{\ell}} s_{\ell-1}^{[t]} + b_{c_{\ell}}) -\end{aligned}\]

source
RecurrentLayers.MUT1Type
MUT1((input_size => hidden_size); kwargs...)

Mutated unit 1 network. See MUT1Cell for a layer that processes a single sequence.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer
  • init_kernel: initializer for the input to hidden weights
  • init_recurrent_kernel: initializer for the hidden to hidden weights
  • bias: include a bias or not. Default is true

Equations

\[\begin{aligned} +\end{aligned}\]

source
RecurrentLayers.MUT1Type
MUT1((input_size => hidden_size); kwargs...)

Mutated unit 1 network. See MUT1Cell for a layer that processes a single sequence.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer
  • init_kernel: initializer for the input to hidden weights
  • init_recurrent_kernel: initializer for the hidden to hidden weights
  • bias: include a bias or not. Default is true

Equations

\[\begin{aligned} z &= \sigma(W_z x_t + b_z), \\ r &= \sigma(W_r x_t + U_r h_t + b_r), \\ h_{t+1} &= \tanh(U_h (r \odot h_t) + \tanh(W_h x_t) + b_h) \odot z \\ &\quad + h_t \odot (1 - z). -\end{aligned}\]

Forward

mut(inp, [state])

The arguments of the forward pass are:

  • inp: The input to the mut. It should be a vector of size input_size x len or a matrix of size input_size x len x batch_size.
  • state: The hidden state of the MUT. If given, it is a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros.

Returns new hidden states new_states as an array of size hidden_size x len x batch_size.

source
RecurrentLayers.MUT2Type
MUT2((input_size => hidden_size); kwargs...)

Mutated unit 2 network. See MUT2Cell for a layer that processes a single sequence.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer
  • init_kernel: initializer for the input to hidden weights
  • init_recurrent_kernel: initializer for the hidden to hidden weights
  • bias: include a bias or not. Default is true

Equations

\[\begin{aligned} +\end{aligned}\]

Forward

mut(inp, [state])

The arguments of the forward pass are:

  • inp: The input to the mut. It should be a matrix of size input_size x len or an array of size input_size x len x batch_size.
  • state: The hidden state of the MUT. If given, it is a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros.

Returns new hidden states new_states as an array of size hidden_size x len x batch_size.
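
Examples

A minimal usage sketch with arbitrary sizes:

mut = MUT1(3 => 5)
#a sequence of 10 steps, each with 3 features
inp = rand(Float32, 3, 10)
#hidden states for the full sequence, of size 5 x 10
output = mut(inp)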

source
RecurrentLayers.MUT2Type
MUT2((input_size => hidden_size); kwargs...)

Mutated unit 2 network. See MUT2Cell for a layer that processes a single sequence.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer
  • init_kernel: initializer for the input to hidden weights
  • init_recurrent_kernel: initializer for the hidden to hidden weights
  • bias: include a bias or not. Default is true

Equations

\[\begin{aligned} z &= \sigma(W_z x_t + U_z h_t + b_z), \\ r &= \sigma(x_t + U_r h_t + b_r), \\ h_{t+1} &= \tanh(U_h (r \odot h_t) + W_h x_t + b_h) \odot z \\ &\quad + h_t \odot (1 - z). -\end{aligned}\]

Forward

mut(inp, [state])

The arguments of the forward pass are:

  • inp: The input to the mut. It should be a vector of size input_size x len or a matrix of size input_size x len x batch_size.
  • state: The hidden state of the MUT. If given, it is a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros.

Returns new hidden states new_states as an array of size hidden_size x len x batch_size.

source
RecurrentLayers.MUT3Type
MUT3((input_size => hidden_size); kwargs...)

Mutated unit 3 network. See MUT3Cell for a layer that processes a single sequence.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer
  • init_kernel: initializer for the input to hidden weights
  • init_recurrent_kernel: initializer for the hidden to hidden weights
  • bias: include a bias or not. Default is true

Equations

\[\begin{aligned} +\end{aligned}\]

Forward

mut(inp, [state])

The arguments of the forward pass are:

  • inp: The input to the mut. It should be a matrix of size input_size x len or an array of size input_size x len x batch_size.
  • state: The hidden state of the MUT. If given, it is a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros.

Returns new hidden states new_states as an array of size hidden_size x len x batch_size.
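
Examples

A minimal usage sketch with arbitrary sizes:

mut = MUT2(3 => 5)
#a sequence of 10 steps, each with 3 features
inp = rand(Float32, 3, 10)
#hidden states for the full sequence, of size 5 x 10
output = mut(inp)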

source
RecurrentLayers.MUT3Type
MUT3((input_size => hidden_size); kwargs...)

Mutated unit 3 network. See MUT3Cell for a layer that processes a single sequence.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer
  • init_kernel: initializer for the input to hidden weights
  • init_recurrent_kernel: initializer for the hidden to hidden weights
  • bias: include a bias or not. Default is true

Equations

\[\begin{aligned} z &= \sigma(W_z x_t + U_z \tanh(h_t) + b_z), \\ r &= \sigma(W_r x_t + U_r h_t + b_r), \\ h_{t+1} &= \tanh(U_h (r \odot h_t) + W_h x_t + b_h) \odot z \\ &\quad + h_t \odot (1 - z). -\end{aligned}\]

Forward

mut(inp, [state])

The arguments of the forward pass are:

  • inp: The input to the mut. It should be a vector of size input_size x len or a matrix of size input_size x len x batch_size.
  • state: The hidden state of the MUT. If given, it is a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros.

Returns new hidden states new_states as an array of size hidden_size x len x batch_size.

source
RecurrentLayers.SCRNType
SCRN((input_size => hidden_size)::Pair;
+\end{aligned}\]

Forward

mut(inp, [state])

The arguments of the forward pass are:

  • inp: The input to the mut. It should be a matrix of size input_size x len or an array of size input_size x len x batch_size.
  • state: The hidden state of the MUT. If given, it is a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros.

Returns new hidden states new_states as an array of size hidden_size x len x batch_size.
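
Examples

A minimal usage sketch with arbitrary sizes:

mut = MUT3(3 => 5)
#a sequence of 10 steps, each with 3 features
inp = rand(Float32, 3, 10)
#hidden states for the full sequence, of size 5 x 10
output = mut(inp)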

source
RecurrentLayers.SCRNType
SCRN((input_size => hidden_size)::Pair;
     init_kernel = glorot_uniform,
     init_recurrent_kernel = glorot_uniform,
     bias = true,
@@ -71,17 +71,17 @@
 s_t &= (1 - \alpha) W_s x_t + \alpha s_{t-1}, \\
 h_t &= \sigma(W_h s_t + U_h h_{t-1} + b_h), \\
 y_t &= f(U_y h_t + W_y s_t)
-\end{aligned}\]

source
RecurrentLayers.PeepholeLSTMType
PeepholeLSTM((input_size => hidden_size)::Pair; kwargs...)

Peephole long short term memory network. See PeepholeLSTMCell for a layer that processes a single sequence.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer
  • init_kernel: initializer for the input to hidden weights
  • init_recurrent_kernel: initializer for the hidden to hidden weights
  • bias: include a bias or not. Default is true

Equations

\[\begin{align} +\end{aligned}\]

source
RecurrentLayers.PeepholeLSTMType
PeepholeLSTM((input_size => hidden_size)::Pair; kwargs...)

Peephole long short-term memory network. See PeepholeLSTMCell for a layer that processes a single sequence.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer
  • init_kernel: initializer for the input to hidden weights
  • init_recurrent_kernel: initializer for the hidden to hidden weights
  • bias: include a bias or not. Default is true

Equations

\[\begin{align} f_t &= \sigma_g(W_f x_t + U_f c_{t-1} + b_f), \\ i_t &= \sigma_g(W_i x_t + U_i c_{t-1} + b_i), \\ o_t &= \sigma_g(W_o x_t + U_o c_{t-1} + b_o), \\ c_t &= f_t \odot c_{t-1} + i_t \odot \sigma_c(W_c x_t + b_c), \\ h_t &= o_t \odot \sigma_h(c_t). -\end{align}\]

source
RecurrentLayers.FastRNNType
FastRNN((input_size => hidden_size), [activation]; kwargs...)

Fast recurrent neural network. See FastRNNCell for a layer that processes a single sequence.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer
  • activation: the activation function, defaults to tanh_fast
  • init_kernel: initializer for the input to hidden weights
  • init_recurrent_kernel: initializer for the hidden to hidden weights
  • bias: include a bias or not. Default is true

Equations

\[\begin{aligned} +\end{align}\]

source
RecurrentLayers.FastRNNType
FastRNN((input_size => hidden_size), [activation]; kwargs...)

Fast recurrent neural network. See FastRNNCell for a layer that processes a single sequence.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer
  • activation: the activation function, defaults to tanh_fast
  • init_kernel: initializer for the input to hidden weights
  • init_recurrent_kernel: initializer for the hidden to hidden weights
  • bias: include a bias or not. Default is true

Equations

\[\begin{aligned} \tilde{h}_t &= \sigma(W_h x_t + U_h h_{t-1} + b), \\ h_t &= \alpha \tilde{h}_t + \beta h_{t-1} -\end{aligned}\]

Forward

fastrnn(inp, [state])

The arguments of the forward pass are:

  • inp: The input to the fastrnn. It should be a vector of size input_size x len or a matrix of size input_size x len x batch_size.
  • state: The hidden state of the FastRNN. If given, it is a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros.

Returns new hidden states new_states as an array of size hidden_size x len x batch_size.

source
RecurrentLayers.FastGRNNType
FastGRNN((input_size => hidden_size), [activation]; kwargs...)

Fast gated recurrent neural network. See FastGRNNCell for a layer that processes a single sequence.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer
  • activation: the activation function, defaults to tanh_fast
  • init_kernel: initializer for the input to hidden weights
  • init_recurrent_kernel: initializer for the hidden to hidden weights
  • bias: include a bias or not. Default is true

Equations

\[\begin{aligned} +\end{aligned}\]

Forward

fastrnn(inp, [state])

The arguments of the forward pass are:

  • inp: The input to the fastrnn. It should be a matrix of size input_size x len or an array of size input_size x len x batch_size.
  • state: The hidden state of the FastRNN. If given, it is a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros.

Returns new hidden states new_states as an array of size hidden_size x len x batch_size.
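
Examples

A minimal usage sketch with the default activation and arbitrary sizes:

fastrnn = FastRNN(3 => 5)
#a sequence of 10 steps, each with 3 features
inp = rand(Float32, 3, 10)
#hidden states for the full sequence, of size 5 x 10
output = fastrnn(inp)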

source
RecurrentLayers.FastGRNNType
FastGRNN((input_size => hidden_size), [activation]; kwargs...)

Fast gated recurrent neural network. See FastGRNNCell for a layer that processes a single sequence.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer
  • activation: the activation function, defaults to tanh_fast
  • init_kernel: initializer for the input to hidden weights
  • init_recurrent_kernel: initializer for the hidden to hidden weights
  • bias: include a bias or not. Default is true

Equations

\[\begin{aligned} z_t &= \sigma(W_z x_t + U_z h_{t-1} + b_z), \\ \tilde{h}_t &= \tanh(W_h x_t + U_h h_{t-1} + b_h), \\ h_t &= \big((\zeta (1 - z_t) + \nu) \odot \tilde{h}_t\big) + z_t \odot h_{t-1} -\end{aligned}\]

Forward

fastgrnn(inp, [state])

The arguments of the forward pass are:

  • inp: The input to the fastgrnn. It should be a vector of size input_size or a matrix of size input_size x batch_size.
  • state: The hidden state of the FastGRNN. It should be a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros.

Returns a tuple (output, state), where both elements are given by the updated state new_state, a tensor of size hidden_size or hidden_size x batch_size.

source
+\end{aligned}\]

Forward

fastgrnn(inp, [state])

The arguments of the forward pass are:

  • inp: The input to the fastgrnn. It should be a matrix of size input_size x len or an array of size input_size x len x batch_size.
  • state: The hidden state of the FastGRNN. If given, it is a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros.

Returns new hidden states new_states as an array of size hidden_size x len x batch_size.
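
Examples

A minimal usage sketch with the default activation and arbitrary sizes:

fastgrnn = FastGRNN(3 => 5)
#a sequence of 10 steps, each with 3 features
inp = rand(Float32, 3, 10)
#hidden states for the full sequence, of size 5 x 10
output = fastgrnn(inp)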

source diff --git a/previews/PR27/index.html b/previews/PR27/index.html index 3078c10..a02f870 100644 --- a/previews/PR27/index.html +++ b/previews/PR27/index.html @@ -1,2 +1,2 @@ -Home · RecurrentLayers.jl

RecurrentLayers

RecurrentLayers.jl extends the recurrent layers offered by Flux.jl with implementations of bleeding-edge recurrent layers that are not commonly available in base deep learning libraries. It is designed for seamless integration with the larger Flux ecosystem, enabling researchers and practitioners to leverage the latest developments in recurrent neural networks.

Implemented layers

  • Minimal gated unit as MGUCell arxiv
  • Light gated recurrent unit as LiGRUCell arxiv
  • Independently recurrent neural networks as IndRNNCell arxiv
  • Recurrent additive networks as RANCell arxiv
  • Recurrent highway network as RHNCell arxiv
  • Light recurrent unit as LightRUCell pub
  • Neural architecture search unit NASCell arxiv
  • Evolving recurrent neural networks as MUT1Cell, MUT2Cell, MUT3Cell pub
  • Structurally constrained recurrent neural network as SCRNCell arxiv
  • Peephole long short term memory as PeepholeLSTMCell pub
  • FastRNNCell and FastGRNNCell arxiv

Contributing

Contributions are always welcome! We specifically look for:

  • Recurrent cells you would like to see implemented
  • Benchmarks
  • Any bugs and mistakes of course!
  • Documentation, in any form: examples, how-tos, docstrings
+Home · RecurrentLayers.jl

RecurrentLayers

RecurrentLayers.jl extends the recurrent layers offered by Flux.jl with implementations of bleeding-edge recurrent layers that are not commonly available in base deep learning libraries. It is designed for seamless integration with the larger Flux ecosystem, enabling researchers and practitioners to leverage the latest developments in recurrent neural networks.

Implemented layers

  • Minimal gated unit as MGUCell arxiv
  • Light gated recurrent unit as LiGRUCell arxiv
  • Independently recurrent neural networks as IndRNNCell arxiv
  • Recurrent additive networks as RANCell arxiv
  • Recurrent highway network as RHNCell arxiv
  • Light recurrent unit as LightRUCell pub
  • Neural architecture search unit NASCell arxiv
  • Evolving recurrent neural networks as MUT1Cell, MUT2Cell, MUT3Cell pub
  • Structurally constrained recurrent neural network as SCRNCell arxiv
  • Peephole long short term memory as PeepholeLSTMCell pub
  • FastRNNCell and FastGRNNCell arxiv

Contributing

Contributions are always welcome! We specifically look for:

  • Recurrent cells you would like to see implemented
  • Benchmarks
  • Any bugs and mistakes of course!
  • Documentation, in any form: examples, how-tos, docstrings
diff --git a/previews/PR27/roadmap/index.html b/previews/PR27/roadmap/index.html index 5848bd9..13e000a 100644 --- a/previews/PR27/roadmap/index.html +++ b/previews/PR27/roadmap/index.html @@ -1,2 +1,2 @@ -Roadmap · RecurrentLayers.jl

Roadmap

This page documents some planned work for RecurrentLayers.jl. Future work for this library includes additional cells such as:

  • FastRNNs and FastGRUs (current focus) arxiv
  • Unitary recurrent neural networks arxiv
  • Modern recurrent neural networks such as LRU and minLSTM/minGRU
  • Quasi recurrent neural networks arxiv

Additionally, some cell-independent architectures are planned that expand the capabilities of recurrent architectures and could theoretically take any cell:

Ideally, an implementation of these would look like, for example, FastSlow(RNNCell, input_size => hidden_size). More details on this soon!

+Roadmap · RecurrentLayers.jl

Roadmap

This page documents some planned work for RecurrentLayers.jl. Future work for this library includes additional cells such as:

  • FastRNNs and FastGRUs (current focus) arxiv
  • Unitary recurrent neural networks arxiv
  • Modern recurrent neural networks such as LRU and minLSTM/minGRU
  • Quasi recurrent neural networks arxiv

Additionally, some cell-independent architectures are planned that expand the capabilities of recurrent architectures and could theoretically take any cell:

Ideally, an implementation of these would look like, for example, FastSlow(RNNCell, input_size => hidden_size). More details on this soon!

diff --git a/previews/PR27/search_index.js b/previews/PR27/search_index.js index 89933c9..b956011 100644 --- a/previews/PR27/search_index.js +++ b/previews/PR27/search_index.js @@ -1,3 +1,3 @@ var documenterSearchIndex = {"docs": -[{"location":"api/cells/#Cells","page":"Cells","title":"Cells","text":"","category":"section"},{"location":"api/cells/","page":"Cells","title":"Cells","text":"RANCell\nIndRNNCell\nLightRUCell\nLiGRUCell\nMGUCell\nNASCell\nRHNCell\nRHNCellUnit\nMUT1Cell\nMUT2Cell\nMUT3Cell\nSCRNCell\nPeepholeLSTMCell\nFastRNNCell\nFastGRNNCell","category":"page"},{"location":"api/cells/#RecurrentLayers.RANCell","page":"Cells","title":"RecurrentLayers.RANCell","text":"RANCell((input_size => hidden_size)::Pair;\n init_kernel = glorot_uniform,\n init_recurrent_kernel = glorot_uniform,\n bias = true)\n\nThe RANCell, introduced in this paper, is a recurrent cell layer which provides additional memory through the use of gates.\n\nand returns both ht anf ct.\n\nSee RAN for a layer that processes entire sequences.\n\nArguments\n\ninput_size => hidden_size: input and inner dimension of the layer\ninit_kernel: initializer for the input to hidden weights\ninit_recurrent_kernel: initializer for the hidden to hidden weights\nbias: include a bias or not. Default is true\n\nEquations\n\nbeginaligned\ntildec_t = W_c x_t \ni_t = sigma(W_i x_t + U_i h_t-1 + b_i) \nf_t = sigma(W_f x_t + U_f h_t-1 + b_f) \nc_t = i_t odot tildec_t + f_t odot c_t-1 \nh_t = g(c_t)\nendaligned\n\nForward\n\nrancell(x, [h, c])\n\nThe forward pass takes the following arguments:\n\nx: Input to the cell, which can be a vector of size in or a matrix of size in x batch_size.\nh: The hidden state vector of the cell, sized out, or a matrix of size out x batch_size.\nc: The candidate state, sized out, or a matrix of size out x batch_size.\n\nIf not provided, both h and c default to vectors of zeros.\n\nExamples\n\nrancell = RANCell(3 => 5)\ninp = rand(Float32, 3)\n#initializing the hidden states, if we want to provide them\nstate = rand(Float32, 5)\nc_state = rand(Float32, 5)\n\n#result with default initialization of internal states\nresult = rancell(inp)\n#result with internal states provided\nresult_state = rancell(inp, (state, c_state))\n\n\n\n\n\n","category":"type"},{"location":"api/cells/#RecurrentLayers.IndRNNCell","page":"Cells","title":"RecurrentLayers.IndRNNCell","text":"IndRNNCell((input_size => hidden_size)::Pair, σ=relu;\n init_kernel = glorot_uniform,\n init_recurrent_kernel = glorot_uniform,\n bias = true)\n\nIndependently recurrent cell. See IndRNN for a layer that processes entire sequences.\n\nArguments\n\ninput_size => hidden_size: input and inner dimension of the layer\nσ: activation function. Default is tanh\ninit_kernel: initializer for the input to hidden weights\ninit_recurrent_kernel: initializer for the hidden to hidden weights\nbias: include a bias or not. Default is true\n\nEquations\n\nmathbfh_t = sigma(mathbfW mathbfx_t + mathbfu odot mathbfh_t-1 + mathbfb)\n\nForward\n\nindrnncell(inp, [state])\n\nThe arguments of the forward pass are:\n\ninp: The input to the indrnncell. It should be a vector of size input_size or a matrix of size input_size x batch_size.\nstate: The hidden state of the IndRNNCell. It should be a vector of size hidden_size or a matrix of size hidden_size x batch_size. 
If not provided, it is assumed to be a vector of zeros.\n\nReturns a tuple (output, state), where both elements are given by the updated state new_state, a tensor of size hidden_size or hidden_size x batch_size.\n\n\n\n\n\n","category":"type"},{"location":"api/cells/#RecurrentLayers.LightRUCell","page":"Cells","title":"RecurrentLayers.LightRUCell","text":"LightRUCell((input_size => hidden_size)::Pair;\n init_kernel = glorot_uniform,\n init_recurrent_kernel = glorot_uniform,\n bias = true)\n\nLight recurrent unit. See LightRU for a layer that processes entire sequences.\n\nArguments\n\ninput_size => hidden_size: input and inner dimension of the layer\ninit_kernel: initializer for the input to hidden weights\ninit_recurrent_kernel: initializer for the hidden to hidden weights\nbias: include a bias or not. Default is true\n\nEquations\n\nbeginaligned\ntildeh_t = tanh(W_h x_t) \nf_t = delta(W_f x_t + U_f h_t-1 + b_f) \nh_t = (1 - f_t) odot h_t-1 + f_t odot tildeh_t\nendaligned\n\nForward\n\nlightrucell(inp, [state])\n\nThe arguments of the forward pass are:\n\ninp: The input to the lightrucell. It should be a vector of size input_size or a matrix of size input_size x batch_size.\nstate: The hidden state of the LightRUCell. It should be a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros.\n\nReturns a tuple (output, state), where both elements are given by the updated state new_state, a tensor of size hidden_size or hidden_size x batch_size.\n\n\n\n\n\n","category":"type"},{"location":"api/cells/#RecurrentLayers.LiGRUCell","page":"Cells","title":"RecurrentLayers.LiGRUCell","text":"LiGRUCell((input_size => hidden_size)::Pair;\n init_kernel = glorot_uniform,\n init_recurrent_kernel = glorot_uniform,\n bias = true)\n\nLight gated recurrent unit. The implementation does not include the batch normalization as described in the original paper. See LiGRU for a layer that processes entire sequences.\n\nArguments\n\ninput_size => hidden_size: input and inner dimension of the layer\ninit_kernel: initializer for the input to hidden weights\ninit_recurrent_kernel: initializer for the hidden to hidden weights\nbias: include a bias or not. Default is true\n\nEquations\n\nbeginaligned\nz_t = sigma(W_z x_t + U_z h_t-1) \ntildeh_t = textReLU(W_h x_t + U_h h_t-1) \nh_t = z_t odot h_t-1 + (1 - z_t) odot tildeh_t\nendaligned\n\nForward\n\nligrucell(inp, [state])\n\nThe arguments of the forward pass are:\n\ninp: The input to the ligrucell. It should be a vector of size input_size or a matrix of size input_size x batch_size.\nstate: The hidden state of the LiGRUCell. It should be a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros.\n\nReturns a tuple (output, state), where both elements are given by the updated state new_state, a tensor of size hidden_size or hidden_size x batch_size.\n\n\n\n\n\n","category":"type"},{"location":"api/cells/#RecurrentLayers.MGUCell","page":"Cells","title":"RecurrentLayers.MGUCell","text":"MGUCell((input_size => hidden_size)::Pair;\n init_kernel = glorot_uniform,\n init_recurrent_kernel = glorot_uniform,\n bias = true)\n\nMinimal gated unit. See MGU for a layer that processes entire sequences.\n\nArguments\n\ninput_size => hidden_size: input and inner dimension of the layer\ninit_kernel: initializer for the input to hidden weights\ninit_recurrent_kernel: initializer for the hidden to hidden weights\nbias: include a bias or not. 
Default is true\n\nEquations\n\nbeginaligned\nf_t = sigma(W_f x_t + U_f h_t-1 + b_f) \ntildeh_t = tanh(W_h x_t + U_h (f_t odot h_t-1) + b_h) \nh_t = (1 - f_t) odot h_t-1 + f_t odot tildeh_t\nendaligned\n\nForward\n\nmgucell(inp, [state])\n\nThe arguments of the forward pass are:\n\ninp: The input to the mgucell. It should be a vector of size input_size or a matrix of size input_size x batch_size.\nstate: The hidden state of the MGUCell. It should be a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros.\n\nReturns a tuple (output, state), where both elements are given by the updated state new_state, a tensor of size hidden_size or hidden_size x batch_size.\n\n\n\n\n\n","category":"type"},{"location":"api/cells/#RecurrentLayers.NASCell","page":"Cells","title":"RecurrentLayers.NASCell","text":"NASCell((input_size => hidden_size);\n    init_kernel = glorot_uniform,\n    init_recurrent_kernel = glorot_uniform,\n    bias = true)\n\nNeural Architecture Search unit. See NAS for a layer that processes entire sequences.\n\nArguments\n\ninput_size => hidden_size: input and inner dimension of the layer\ninit_kernel: initializer for the input to hidden weights\ninit_recurrent_kernel: initializer for the hidden to hidden weights\nbias: include a bias or not. Default is true\n\nEquations\n\nbeginaligned\ntextFirst Layer Outputs \no_1 = sigma(W_i^(1) x_t + W_h^(1) h_t-1 + b^(1)) \no_2 = textReLU(W_i^(2) x_t + W_h^(2) h_t-1 + b^(2)) \no_3 = sigma(W_i^(3) x_t + W_h^(3) h_t-1 + b^(3)) \no_4 = textReLU(W_i^(4) x_t cdot W_h^(4) h_t-1) \no_5 = tanh(W_i^(5) x_t + W_h^(5) h_t-1 + b^(5)) \no_6 = sigma(W_i^(6) x_t + W_h^(6) h_t-1 + b^(6)) \no_7 = tanh(W_i^(7) x_t + W_h^(7) h_t-1 + b^(7)) \no_8 = sigma(W_i^(8) x_t + W_h^(8) h_t-1 + b^(8)) \n\ntextSecond Layer Computations \nl_1 = tanh(o_1 cdot o_2) \nl_2 = tanh(o_3 + o_4) \nl_3 = tanh(o_5 cdot o_6) \nl_4 = sigma(o_7 + o_8) \n\ntextInject Cell State \nl_1 = tanh(l_1 + c_textstate) \n\ntextFinal Layer Computations \nc_textnew = l_1 cdot l_2 \nl_5 = tanh(l_3 + l_4) \nh_textnew = tanh(c_textnew cdot l_5)\nendaligned\n\nForward\n\nrnncell(inp, [state])\n\n\n\n\n\n","category":"type"},{"location":"api/cells/#RecurrentLayers.RHNCell","page":"Cells","title":"RecurrentLayers.RHNCell","text":"RHNCell((input_size => hidden_size), depth=3;\n    couple_carry::Bool = true,\n    cell_kwargs...)\n\nRecurrent highway network. See RHNCellUnit for the unit component of this layer. See RHN for a layer that processes entire sequences.\n\nArguments\n\ninput_size => hidden_size: input and inner dimension of the layer\ndepth: depth of the recurrence. Default is 3\ncouple_carry: couples the carry gate and the transform gate. Default true\ninit_kernel: initializer for the input to hidden weights\nbias: include a bias or not. 
Default is true\n\nEquations\n\nbeginaligned\ns_ell^t = h_ell^t odot t_ell^t + s_ell-1^t odot c_ell^t \ntextwhere \nh_ell^t = tanh(W_h x^tmathbbI_ell = 1 + U_h_ell s_ell-1^t + b_h_ell) \nt_ell^t = sigma(W_t x^tmathbbI_ell = 1 + U_t_ell s_ell-1^t + b_t_ell) \nc_ell^t = sigma(W_c x^tmathbbI_ell = 1 + U_c_ell s_ell-1^t + b_c_ell)\nendaligned\n\nForward\n\nrnncell(inp, [state])\n\n\n\n\n\n","category":"type"},{"location":"api/cells/#RecurrentLayers.RHNCellUnit","page":"Cells","title":"RecurrentLayers.RHNCellUnit","text":"RHNCellUnit((input_size => hidden_size)::Pair;\n init_kernel = glorot_uniform,\n bias = true)\n\n\n\n\n\n","category":"type"},{"location":"api/cells/#RecurrentLayers.MUT1Cell","page":"Cells","title":"RecurrentLayers.MUT1Cell","text":"MUT1Cell((input_size => hidden_size);\n init_kernel = glorot_uniform,\n init_recurrent_kernel = glorot_uniform,\n bias = true)\n\nMutated unit 1 cell. See MUT1 for a layer that processes entire sequences.\n\nArguments\n\ninput_size => hidden_size: input and inner dimension of the layer\ninit_kernel: initializer for the input to hidden weights\ninit_recurrent_kernel: initializer for the hidden to hidden weights\nbias: include a bias or not. Default is true\n\nEquations\n\nbeginaligned\nz = sigma(W_z x_t + b_z) \nr = sigma(W_r x_t + U_r h_t + b_r) \nh_t+1 = tanh(U_h (r odot h_t) + tanh(W_h x_t) + b_h) odot z \nquad + h_t odot (1 - z)\nendaligned\n\nForward\n\nmutcell(inp, [state])\n\nThe arguments of the forward pass are:\n\ninp: The input to the mutcell. It should be a vector of size input_size or a matrix of size input_size x batch_size.\nstate: The hidden state of the MUTCell. It should be a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros.\n\nReturns a tuple (output, state), where both elements are given by the updated state new_state, a tensor of size hidden_size or hidden_size x batch_size.\n\n\n\n\n\n","category":"type"},{"location":"api/cells/#RecurrentLayers.MUT2Cell","page":"Cells","title":"RecurrentLayers.MUT2Cell","text":"MUT2Cell((input_size => hidden_size);\n init_kernel = glorot_uniform,\n init_recurrent_kernel = glorot_uniform,\n bias = true)\n\nMutated unit 2 cell. See MUT2 for a layer that processes entire sequences.\n\nArguments\n\ninput_size => hidden_size: input and inner dimension of the layer\ninit_kernel: initializer for the input to hidden weights\ninit_recurrent_kernel: initializer for the hidden to hidden weights\nbias: include a bias or not. Default is true\n\nEquations\n\nbeginaligned\nz = sigma(W_z x_t + U_z h_t + b_z) \nr = sigma(x_t + U_r h_t + b_r) \nh_t+1 = tanh(U_h (r odot h_t) + W_h x_t + b_h) odot z \nquad + h_t odot (1 - z)\nendaligned\n\nForward\n\nmutcell(inp, [state])\n\nThe arguments of the forward pass are:\n\ninp: The input to the mutcell. It should be a vector of size input_size or a matrix of size input_size x batch_size.\nstate: The hidden state of the MUTCell. It should be a vector of size hidden_size or a matrix of size hidden_size x batch_size. 
If not provided, it is assumed to be a vector of zeros.\n\nReturns a tuple (output, state), where both elements are given by the updated state new_state, a tensor of size hidden_size or hidden_size x batch_size.\n\n\n\n\n\n","category":"type"},{"location":"api/cells/#RecurrentLayers.MUT3Cell","page":"Cells","title":"RecurrentLayers.MUT3Cell","text":"MUT3Cell((input_size => hidden_size);\n    init_kernel = glorot_uniform,\n    init_recurrent_kernel = glorot_uniform,\n    bias = true)\n\nMutated unit 3 cell. See MUT3 for a layer that processes entire sequences.\n\nArguments\n\ninput_size => hidden_size: input and inner dimension of the layer\ninit_kernel: initializer for the input to hidden weights\ninit_recurrent_kernel: initializer for the hidden to hidden weights\nbias: include a bias or not. Default is true\n\nEquations\n\nbeginaligned\nz = sigma(W_z x_t + U_z tanh(h_t) + b_z) \nr = sigma(W_r x_t + U_r h_t + b_r) \nh_t+1 = tanh(U_h (r odot h_t) + W_h x_t + b_h) odot z \nquad + h_t odot (1 - z)\nendaligned\n\nForward\n\nmutcell(inp, [state])\n\nThe arguments of the forward pass are:\n\ninp: The input to the mutcell. It should be a vector of size input_size or a matrix of size input_size x batch_size.\nstate: The hidden state of the MUTCell. It should be a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros.\n\nReturns a tuple (output, state), where both elements are given by the updated state new_state, a tensor of size hidden_size or hidden_size x batch_size.\n\n\n\n\n\n","category":"type"},{"location":"api/cells/#RecurrentLayers.SCRNCell","page":"Cells","title":"RecurrentLayers.SCRNCell","text":"SCRNCell((input_size => hidden_size)::Pair;\n    init_kernel = glorot_uniform,\n    init_recurrent_kernel = glorot_uniform,\n    bias = true,\n    alpha = 0.0)\n\nStructurally constrained recurrent unit. See SCRN for a layer that processes entire sequences.\n\nArguments\n\ninput_size => hidden_size: input and inner dimension of the layer\ninit_kernel: initializer for the input to hidden weights\ninit_recurrent_kernel: initializer for the hidden to hidden weights\nbias: include a bias or not. Default is true\nalpha: structural constraint. Default is 0.0\n\nEquations\n\nbeginaligned\ns_t = (1 - alpha) W_s x_t + alpha s_t-1 \nh_t = sigma(W_h s_t + U_h h_t-1 + b_h) \ny_t = f(U_y h_t + W_y s_t)\nendaligned\n\nForward\n\nrnncell(inp, [state, c_state])\n\n\n\n\n\n","category":"type"},{"location":"api/cells/#RecurrentLayers.PeepholeLSTMCell","page":"Cells","title":"RecurrentLayers.PeepholeLSTMCell","text":"PeepholeLSTMCell((input_size => hidden_size)::Pair;\n    init_kernel = glorot_uniform,\n    init_recurrent_kernel = glorot_uniform,\n    bias = true)\n\nPeephole long short term memory cell. See PeepholeLSTM for a layer that processes entire sequences.\n\nArguments\n\ninput_size => hidden_size: input and inner dimension of the layer\ninit_kernel: initializer for the input to hidden weights\ninit_recurrent_kernel: initializer for the hidden to hidden weights\nbias: include a bias or not. 
Default is true\n\nEquations\n\nbeginaligned\nf_t = sigma_g(W_f x_t + U_f c_t-1 + b_f) \ni_t = sigma_g(W_i x_t + U_i c_t-1 + b_i) \no_t = sigma_g(W_o x_t + U_o c_t-1 + b_o) \nc_t = f_t odot c_t-1 + i_t odot sigma_c(W_c x_t + b_c) \nh_t = o_t odot sigma_h(c_t)\nendaligned\n\nForward\n\nlstmcell(x, [h, c])\n\nThe forward pass takes the following arguments:\n\nx: Input to the cell, which can be a vector of size in or a matrix of size in x batch_size.\nh: The hidden state vector of the cell, sized out, or a matrix of size out x batch_size.\nc: The candidate state, sized out, or a matrix of size out x batch_size.\n\nIf not provided, both h and c default to vectors of zeros.\n\nExamples\n\n\n\n\n\n","category":"type"},{"location":"api/cells/#RecurrentLayers.FastRNNCell","page":"Cells","title":"RecurrentLayers.FastRNNCell","text":"FastRNNCell((input_size => hidden_size), [activation];\n init_kernel = glorot_uniform,\n init_recurrent_kernel = glorot_uniform,\n bias = true)\n\nFast recurrent neural network cell. See FastRNN for a layer that processes entire sequences.\n\nArguments\n\ninput_size => hidden_size: input and inner dimension of the layer\nactivation: the activation function, defaults to tanh_fast\ninit_kernel: initializer for the input to hidden weights\ninit_recurrent_kernel: initializer for the hidden to hidden weights\nbias: include a bias or not. Default is true\n\nEquations\n\nbeginaligned\ntildeh_t = sigma(W_h x_t + U_h h_t-1 + b) \nh_t = alpha tildeh_t + beta h_t-1\nendaligned\n\nForward\n\nfastrnncell(inp, [state])\n\nThe arguments of the forward pass are:\n\ninp: The input to the fastrnncell. It should be a vector of size input_size or a matrix of size input_size x batch_size.\nstate: The hidden state of the FastRNN. It should be a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros.\n\nReturns a tuple (output, state), where both elements are given by the updated state new_state, a tensor of size hidden_size or hidden_size x batch_size.\n\n\n\n\n\n","category":"type"},{"location":"api/cells/#RecurrentLayers.FastGRNNCell","page":"Cells","title":"RecurrentLayers.FastGRNNCell","text":"FastGRNNCell((input_size => hidden_size), [activation];\n init_kernel = glorot_uniform,\n init_recurrent_kernel = glorot_uniform,\n bias = true)\n\nFast gated recurrent neural network cell. See FastGRNN for a layer that processes entire sequences.\n\nArguments\n\ninput_size => hidden_size: input and inner dimension of the layer\nactivation: the activation function, defaults to tanh_fast\ninit_kernel: initializer for the input to hidden weights\ninit_recurrent_kernel: initializer for the hidden to hidden weights\nbias: include a bias or not. Default is true\n\nEquations\n\nbeginaligned\nz_t = sigma(W_z x_t + U_z h_t-1 + b_z) \ntildeh_t = tanh(W_h x_t + U_h h_t-1 + b_h) \nh_t = big((zeta (1 - z_t) + nu) odot tildeh_tbig) + z_t odot h_t-1\nendaligned\n\nForward\n\nfastgrnncell(inp, [state])\n\nThe arguments of the forward pass are:\n\ninp: The input to the fastgrnncell. It should be a vector of size input_size or a matrix of size input_size x batch_size.\nstate: The hidden state of the FastGRNN. It should be a vector of size hidden_size or a matrix of size hidden_size x batch_size. 
If not provided, it is assumed to be a vector of zeros.\n\nReturns a tuple (output, state), where both elements are given by the updated state new_state, a tensor of size hidden_size or hidden_size x batch_size.\n\n\n\n\n\n","category":"type"},{"location":"roadmap/#Roadmap","page":"Roadmap","title":"Roadmap","text":"","category":"section"},{"location":"roadmap/","page":"Roadmap","title":"Roadmap","text":"This page documents some planned work for RecurrentLayers.jl. Future work for this library includes additional cells such as:","category":"page"},{"location":"roadmap/","page":"Roadmap","title":"Roadmap","text":"FastRNNs and FastGRUs (current focus) arxiv\nUnitary recurrent neural networks arxiv\nModern recurrent neural networks such as LRU and minLSTM/minGRU\nQuasi recurrent neural networks arxiv","category":"page"},{"location":"roadmap/","page":"Roadmap","title":"Roadmap","text":"Additionally, some cell-independent architectures are also planned, that expand the ability of recurrent architectures and could theoretically take any cell:","category":"page"},{"location":"roadmap/","page":"Roadmap","title":"Roadmap","text":"Clockwork rnns arxiv\nPhased rnns arxiv\nSegment rnn arxiv\nFast-Slow rnns arxiv","category":"page"},{"location":"roadmap/","page":"Roadmap","title":"Roadmap","text":"An implementation of these ideally would be, for example FastSlow(RNNCell, input_size => hidden_size). More details on this soon!","category":"page"},{"location":"api/wrappers/#Cell-wrappers","page":"Cell Wrappers","title":"Cell wrappers","text":"","category":"section"},{"location":"api/wrappers/","page":"Cell Wrappers","title":"Cell Wrappers","text":"RAN\nIndRNN\nLightRU\nLiGRU\nMGU\nNAS\nRHN\nMUT1\nMUT2\nMUT3\nSCRN\nPeepholeLSTM\nFastRNN\nFastGRNN","category":"page"},{"location":"api/wrappers/#RecurrentLayers.RAN","page":"Cell Wrappers","title":"RecurrentLayers.RAN","text":"RAN(input_size => hidden_size; kwargs...)\n\nThe RANCell, introduced in this paper, is a recurrent cell layer which provides additional memory through the use of gates, and returns both ht and ct.\n\nSee RANCell for a layer that processes a single sequence.\n\nArguments\n\ninput_size => hidden_size: input and inner dimension of the layer\ninit_kernel: initializer for the input to hidden weights\ninit_recurrent_kernel: initializer for the hidden to hidden weights\nbias: include a bias or not. Default is true\n\nEquations\n\nbeginaligned\ntildec_t = W_c x_t \ni_t = sigma(W_i x_t + U_i h_t-1 + b_i) \nf_t = sigma(W_f x_t + U_f h_t-1 + b_f) \nc_t = i_t odot tildec_t + f_t odot c_t-1 \nh_t = g(c_t)\nendaligned\n\n\n\n\n\n","category":"type"},{"location":"api/wrappers/#RecurrentLayers.IndRNN","page":"Cell Wrappers","title":"RecurrentLayers.IndRNN","text":"IndRNN((input_size => hidden_size)::Pair, σ=relu;\n    kwargs...)\n\nIndependently recurrent network. See IndRNNCell for a layer that processes a single sequence.\n\nArguments\n\ninput_size => hidden_size: input and inner dimension of the layer\nσ: activation function. Default is relu\ninit_kernel: initializer for the input to hidden weights\ninit_recurrent_kernel: initializer for the hidden to hidden weights\nbias: include a bias or not. Default is true\n\nEquations\n\nmathbfh_t = sigma(mathbfW mathbfx_t + mathbfu odot mathbfh_t-1 + mathbfb)\n\nForward\n\nindrnn(inp, [state])\n\nThe arguments of the forward pass are:\n\ninp: The input to the indrnn. It should be a vector of size input_size x len or a matrix of size input_size x len x batch_size.\nstate: The hidden state of the IndRNN. 
If given, it is a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros.\n\nReturns new hidden states new_states as an array of size hidden_size x len x batch_size.\n\n\n\n\n\n","category":"type"},{"location":"api/wrappers/#RecurrentLayers.LightRU","page":"Cell Wrappers","title":"RecurrentLayers.LightRU","text":"LightRU((input_size => hidden_size)::Pair; kwargs...)\n\nLight recurrent unit network. See LightRUCell for a layer that processes a single sequence.\n\nArguments\n\ninput_size => hidden_size: input and inner dimension of the layer\ninit_kernel: initializer for the input to hidden weights\ninit_recurrent_kernel: initializer for the hidden to hidden weights\nbias: include a bias or not. Default is true\n\nEquations\n\nbeginaligned\ntildeh_t = tanh(W_h x_t) \nf_t = delta(W_f x_t + U_f h_t-1 + b_f) \nh_t = (1 - f_t) odot h_t-1 + f_t odot tildeh_t\nendaligned\n\nForward\n\nlightru(inp, [state])\n\nThe arguments of the forward pass are:\n\ninp: The input to the lightru. It should be a vector of size input_size x len or a matrix of size input_size x len x batch_size.\nstate: The hidden state of the LightRU. If given, it is a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros.\n\nReturns new hidden states new_states as an array of size hidden_size x len x batch_size.\n\n\n\n\n\n","category":"type"},{"location":"api/wrappers/#RecurrentLayers.LiGRU","page":"Cell Wrappers","title":"RecurrentLayers.LiGRU","text":"LiGRU((input_size => hidden_size)::Pair; kwargs...)\n\nLight gated recurrent network. The implementation does not include the batch normalization as described in the original paper. See LiGRUCell for a layer that processes a single sequence.\n\nArguments\n\ninput_size => hidden_size: input and inner dimension of the layer\ninit_kernel: initializer for the input to hidden weights\ninit_recurrent_kernel: initializer for the hidden to hidden weights\nbias: include a bias or not. Default is true\n\nEquations\n\nbeginaligned\nz_t = sigma(W_z x_t + U_z h_t-1) \ntildeh_t = textReLU(W_h x_t + U_h h_t-1) \nh_t = z_t odot h_t-1 + (1 - z_t) odot tildeh_t\nendaligned\n\nForward\n\nligru(inp, [state])\n\nThe arguments of the forward pass are:\n\ninp: The input to the ligru. It should be a vector of size input_size x len or a matrix of size input_size x len x batch_size.\nstate: The hidden state of the LiGRU. If given, it is a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros.\n\nReturns new hidden states new_states as an array of size hidden_size x len x batch_size.\n\n\n\n\n\n","category":"type"},{"location":"api/wrappers/#RecurrentLayers.MGU","page":"Cell Wrappers","title":"RecurrentLayers.MGU","text":"MGU((input_size => hidden_size)::Pair; kwargs...)\n\nMinimal gated unit network. See MGUCell for a layer that processes a single sequence.\n\nArguments\n\ninput_size => hidden_size: input and inner dimension of the layer\ninit_kernel: initializer for the input to hidden weights\ninit_recurrent_kernel: initializer for the hidden to hidden weights\nbias: include a bias or not. 
Default is true\n\nEquations\n\nbeginaligned\nf_t = sigma(W_f x_t + U_f h_t-1 + b_f) \ntildeh_t = tanh(W_h x_t + U_h (f_t odot h_t-1) + b_h) \nh_t = (1 - f_t) odot h_t-1 + f_t odot tildeh_t\nendaligned\n\nForward\n\nmgu(inp, [state])\n\nThe arguments of the forward pass are:\n\ninp: The input to the mgu. It should be a vector of size input_size x len or a matrix of size input_size x len x batch_size.\nstate: The hidden state of the MGU. If given, it is a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros.\n\nReturns new hidden states new_states as an array of size hidden_size x len x batch_size.\n\n\n\n\n\n","category":"type"},{"location":"api/wrappers/#RecurrentLayers.NAS","page":"Cell Wrappers","title":"RecurrentLayers.NAS","text":"NAS((input_size => hidden_size)::Pair; kwargs...)\n\nNeural Architecture Search unit. See NASCell for a layer that processes a single sequence.\n\nArguments\n\ninput_size => hidden_size: input and inner dimension of the layer\ninit_kernel: initializer for the input to hidden weights\ninit_recurrent_kernel: initializer for the hidden to hidden weights\nbias: include a bias or not. Default is true\n\nEquations\n\nbeginaligned\ntextFirst Layer Outputs \no_1 = sigma(W_i^(1) x_t + W_h^(1) h_t-1 + b^(1)) \no_2 = textReLU(W_i^(2) x_t + W_h^(2) h_t-1 + b^(2)) \no_3 = sigma(W_i^(3) x_t + W_h^(3) h_t-1 + b^(3)) \no_4 = textReLU(W_i^(4) x_t cdot W_h^(4) h_t-1) \no_5 = tanh(W_i^(5) x_t + W_h^(5) h_t-1 + b^(5)) \no_6 = sigma(W_i^(6) x_t + W_h^(6) h_t-1 + b^(6)) \no_7 = tanh(W_i^(7) x_t + W_h^(7) h_t-1 + b^(7)) \no_8 = sigma(W_i^(8) x_t + W_h^(8) h_t-1 + b^(8)) \n\ntextSecond Layer Computations \nl_1 = tanh(o_1 cdot o_2) \nl_2 = tanh(o_3 + o_4) \nl_3 = tanh(o_5 cdot o_6) \nl_4 = sigma(o_7 + o_8) \n\ntextInject Cell State \nl_1 = tanh(l_1 + c_textstate) \n\ntextFinal Layer Computations \nc_textnew = l_1 cdot l_2 \nl_5 = tanh(l_3 + l_4) \nh_textnew = tanh(c_textnew cdot l_5)\nendaligned\n\n\n\n\n\n","category":"type"},{"location":"api/wrappers/#RecurrentLayers.RHN","page":"Cell Wrappers","title":"RecurrentLayers.RHN","text":"RHN((input_size => hidden_size)::Pair, depth=3; kwargs...)\n\nRecurrent highway network. See RHNCellUnit for the unit component of this layer. See RHNCell for a layer that processes a single sequence.\n\nArguments\n\ninput_size => hidden_size: input and inner dimension of the layer\ndepth: depth of the recurrence. Default is 3\ncouple_carry: couples the carry gate and the transform gate. Default true\ninit_kernel: initializer for the input to hidden weights\nbias: include a bias or not. Default is true\n\nEquations\n\nbeginaligned\ns_ell^t = h_ell^t odot t_ell^t + s_ell-1^t odot c_ell^t \ntextwhere \nh_ell^t = tanh(W_h x^tmathbbI_ell = 1 + U_h_ell s_ell-1^t + b_h_ell) \nt_ell^t = sigma(W_t x^tmathbbI_ell = 1 + U_t_ell s_ell-1^t + b_t_ell) \nc_ell^t = sigma(W_c x^tmathbbI_ell = 1 + U_c_ell s_ell-1^t + b_c_ell)\nendaligned\n\n\n\n\n\n","category":"type"},{"location":"api/wrappers/#RecurrentLayers.MUT1","page":"Cell Wrappers","title":"RecurrentLayers.MUT1","text":"MUT1((input_size => hidden_size); kwargs...)\n\nMutated unit 1 network. See MUT1Cell for a layer that processes a single sequence.\n\nArguments\n\ninput_size => hidden_size: input and inner dimension of the layer\ninit_kernel: initializer for the input to hidden weights\ninit_recurrent_kernel: initializer for the hidden to hidden weights\nbias: include a bias or not. 
Default is true\n\nEquations\n\nbeginaligned\nz = sigma(W_z x_t + b_z) \nr = sigma(W_r x_t + U_r h_t + b_r) \nh_t+1 = tanh(U_h (r odot h_t) + tanh(W_h x_t) + b_h) odot z \nquad + h_t odot (1 - z)\nendaligned\n\nForward\n\nmut(inp, [state])\n\nThe arguments of the forward pass are:\n\ninp: The input to the mut. It should be a vector of size input_size x len or a matrix of size input_size x len x batch_size.\nstate: The hidden state of the MUT. If given, it is a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros.\n\nReturns new hidden states new_states as an array of size hidden_size x len x batch_size.\n\n\n\n\n\n","category":"type"},{"location":"api/wrappers/#RecurrentLayers.MUT2","page":"Cell Wrappers","title":"RecurrentLayers.MUT2","text":"MUT2((input_size => hidden_size); kwargs...)\n\nMutated unit 2 network. See MUT2Cell for a layer that processes a single sequence.\n\nArguments\n\ninput_size => hidden_size: input and inner dimension of the layer\ninit_kernel: initializer for the input to hidden weights\ninit_recurrent_kernel: initializer for the hidden to hidden weights\nbias: include a bias or not. Default is true\n\nEquations\n\nbeginaligned\nz = sigma(W_z x_t + U_z h_t + b_z) \nr = sigma(x_t + U_r h_t + b_r) \nh_t+1 = tanh(U_h (r odot h_t) + W_h x_t + b_h) odot z \nquad + h_t odot (1 - z)\nendaligned\n\nForward\n\nmut(inp, [state])\n\nThe arguments of the forward pass are:\n\ninp: The input to the mut. It should be a vector of size input_size x len or a matrix of size input_size x len x batch_size.\nstate: The hidden state of the MUT. If given, it is a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros.\n\nReturns new hidden states new_states as an array of size hidden_size x len x batch_size.\n\n\n\n\n\n","category":"type"},{"location":"api/wrappers/#RecurrentLayers.MUT3","page":"Cell Wrappers","title":"RecurrentLayers.MUT3","text":"MUT3((input_size => hidden_size); kwargs...)\n\nMutated unit 3 network. See MUT3Cell for a layer that processes a single sequence.\n\nArguments\n\ninput_size => hidden_size: input and inner dimension of the layer\ninit_kernel: initializer for the input to hidden weights\ninit_recurrent_kernel: initializer for the hidden to hidden weights\nbias: include a bias or not. Default is true\n\nEquations\n\nbeginaligned\nz = sigma(W_z x_t + U_z tanh(h_t) + b_z) \nr = sigma(W_r x_t + U_r h_t + b_r) \nh_t+1 = tanh(U_h (r odot h_t) + W_h x_t + b_h) odot z \nquad + h_t odot (1 - z)\nendaligned\n\nForward\n\nmut(inp, [state])\n\nThe arguments of the forward pass are:\n\ninp: The input to the mut. It should be a vector of size input_size x len or a matrix of size input_size x len x batch_size.\nstate: The hidden state of the MUT. If given, it is a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros.\n\nReturns new hidden states new_states as an array of size hidden_size x len x batch_size.\n\n\n\n\n\n","category":"type"},{"location":"api/wrappers/#RecurrentLayers.SCRN","page":"Cell Wrappers","title":"RecurrentLayers.SCRN","text":"SCRN((input_size => hidden_size)::Pair;\n    init_kernel = glorot_uniform,\n    init_recurrent_kernel = glorot_uniform,\n    bias = true,\n    alpha = 0.0)\n\nStructurally constrained recurrent unit. 
See SCRNCell for a layer that processes a single sequence.\n\nArguments\n\ninput_size => hidden_size: input and inner dimension of the layer\ninit_kernel: initializer for the input to hidden weights\ninit_recurrent_kernel: initializer for the hidden to hidden weights\nbias: include a bias or not. Default is true\nalpha: structural constraint. Default is 0.0\n\nEquations\n\nbeginaligned\ns_t = (1 - alpha) W_s x_t + alpha s_t-1 \nh_t = sigma(W_h s_t + U_h h_t-1 + b_h) \ny_t = f(U_y h_t + W_y s_t)\nendaligned\n\n\n\n\n\n","category":"type"},{"location":"api/wrappers/#RecurrentLayers.PeepholeLSTM","page":"Cell Wrappers","title":"RecurrentLayers.PeepholeLSTM","text":"PeepholeLSTM((input_size => hidden_size)::Pair; kwargs...)\n\nPeephole long short term memory network. See PeepholeLSTMCell for a layer that processes a single sequence.\n\nArguments\n\ninput_size => hidden_size: input and inner dimension of the layer\ninit_kernel: initializer for the input to hidden weights\ninit_recurrent_kernel: initializer for the hidden to hidden weights\nbias: include a bias or not. Default is true\n\nEquations\n\nbeginalign\nf_t = sigma_g(W_f x_t + U_f c_t-1 + b_f) \ni_t = sigma_g(W_i x_t + U_i c_t-1 + b_i) \no_t = sigma_g(W_o x_t + U_o c_t-1 + b_o) \nc_t = f_t odot c_t-1 + i_t odot sigma_c(W_c x_t + b_c) \nh_t = o_t odot sigma_h(c_t)\nendalign\n\n\n\n\n\n","category":"type"},{"location":"api/wrappers/#RecurrentLayers.FastRNN","page":"Cell Wrappers","title":"RecurrentLayers.FastRNN","text":"FastRNN((input_size => hidden_size), [activation]; kwargs...)\n\nFast recurrent neural network. See FastRNNCell for a layer that processes a single sequence.\n\nArguments\n\ninput_size => hidden_size: input and inner dimension of the layer\nactivation: the activation function, defaults to tanh_fast\ninit_kernel: initializer for the input to hidden weights\ninit_recurrent_kernel: initializer for the hidden to hidden weights\nbias: include a bias or not. Default is true\n\nEquations\n\nbeginaligned\ntildeh_t = sigma(W_h x_t + U_h h_t-1 + b) \nh_t = alpha tildeh_t + beta h_t-1\nendaligned\n\nForward\n\nfastrnn(inp, [state])\n\nThe arguments of the forward pass are:\n\ninp: The input to the fastrnn. It should be a vector of size input_size x len or a matrix of size input_size x len x batch_size.\nstate: The hidden state of the FastRNN. If given, it is a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros.\n\nReturns new hidden states new_states as an array of size hidden_size x len x batch_size.\n\n\n\n\n\n","category":"type"},{"location":"api/wrappers/#RecurrentLayers.FastGRNN","page":"Cell Wrappers","title":"RecurrentLayers.FastGRNN","text":"FastGRNN((input_size => hidden_size), [activation]; kwargs...)\n\nFast gated recurrent neural network. See FastGRNNCell for a layer that processes a single sequence.\n\nArguments\n\ninput_size => hidden_size: input and inner dimension of the layer\nactivation: the activation function, defaults to tanh_fast\ninit_kernel: initializer for the input to hidden weights\ninit_recurrent_kernel: initializer for the hidden to hidden weights\nbias: include a bias or not. Default is true\n\nEquations\n\nbeginaligned\nz_t = sigma(W_z x_t + U_z h_t-1 + b_z) \ntildeh_t = tanh(W_h x_t + U_h h_t-1 + b_h) \nh_t = big((zeta (1 - z_t) + nu) odot tildeh_tbig) + z_t odot h_t-1\nendaligned\n\nForward\n\nfastgrnn(inp, [state])\n\nThe arguments of the forward pass are:\n\ninp: The input to the fastgrnn. 
It should be a vector of size input_size or a matrix of size input_size x batch_size.\nstate: The hidden state of the FastGRNN. It should be a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros.\n\nReturns a tuple (output, state), where both elements are given by the updated state new_state, a tensor of size hidden_size or hidden_size x batch_size.\n\n\n\n\n\n","category":"type"},{"location":"","page":"Home","title":"Home","text":"CurrentModule = RecurrentLayers","category":"page"},{"location":"#RecurrentLayers","page":"Home","title":"RecurrentLayers","text":"","category":"section"},{"location":"","page":"Home","title":"Home","text":"RecurrentLayers.jl extends the recurrent layers offering of Flux.jl by providing implementations of bleeding-edge recurrent layers not commonly available in base deep learning libraries. It is designed for seamless integration with the larger Flux ecosystem, enabling researchers and practitioners to leverage the latest developments in recurrent neural networks.","category":"page"},{"location":"#Implemented-layers","page":"Home","title":"Implemented layers","text":"","category":"section"},{"location":"","page":"Home","title":"Home","text":"Minimal gated unit as MGUCell arxiv\nLight gated recurrent unit as LiGRUCell arxiv\nIndependently recurrent neural networks as IndRNNCell arxiv\nRecurrent additive networks as RANCell arxiv\nRecurrent highway network as RHNCell arxiv\nLight recurrent unit as LightRUCell pub\nNeural architecture search unit as NASCell arxiv\nEvolving recurrent neural networks as MUT1Cell, MUT2Cell, MUT3Cell pub\nStructurally constrained recurrent neural network as SCRNCell arxiv\nPeephole long short term memory as PeepholeLSTMCell pub\nFastRNNCell and FastGRNNCell arxiv","category":"page"},{"location":"#Contributing","page":"Home","title":"Contributing","text":"","category":"section"},{"location":"","page":"Home","title":"Home","text":"Contributions are always welcome! We specifically look for:","category":"page"},{"location":"","page":"Home","title":"Home","text":"Recurrent cells you would like to see implemented\nBenchmarks\nAny bugs and mistakes of course!\nDocumentation, in any form: examples, how-tos, docstrings","category":"page"}] +[{"location":"api/cells/#Cells","page":"Cells","title":"Cells","text":"","category":"section"},{"location":"api/cells/","page":"Cells","title":"Cells","text":"RANCell\nIndRNNCell\nLightRUCell\nLiGRUCell\nMGUCell\nNASCell\nRHNCell\nRHNCellUnit\nMUT1Cell\nMUT2Cell\nMUT3Cell\nSCRNCell\nPeepholeLSTMCell\nFastRNNCell\nFastGRNNCell","category":"page"},{"location":"api/cells/#RecurrentLayers.RANCell","page":"Cells","title":"RecurrentLayers.RANCell","text":"RANCell((input_size => hidden_size)::Pair;\n    init_kernel = glorot_uniform,\n    init_recurrent_kernel = glorot_uniform,\n    bias = true)\n\nThe RANCell, introduced in this paper, is a recurrent cell layer which provides additional memory through the use of gates, and returns both ht and ct.\n\nSee RAN for a layer that processes entire sequences.\n\nArguments\n\ninput_size => hidden_size: input and inner dimension of the layer\ninit_kernel: initializer for the input to hidden weights\ninit_recurrent_kernel: initializer for the hidden to hidden weights\nbias: include a bias or not. 
Default is true\n\nEquations\n\nbeginaligned\ntildec_t = W_c x_t \ni_t = sigma(W_i x_t + U_i h_t-1 + b_i) \nf_t = sigma(W_f x_t + U_f h_t-1 + b_f) \nc_t = i_t odot tildec_t + f_t odot c_t-1 \nh_t = g(c_t)\nendaligned\n\nForward\n\nrancell(x, [h, c])\n\nThe forward pass takes the following arguments:\n\nx: Input to the cell, which can be a vector of size in or a matrix of size in x batch_size.\nh: The hidden state vector of the cell, sized out, or a matrix of size out x batch_size.\nc: The candidate state, sized out, or a matrix of size out x batch_size.\n\nIf not provided, both h and c default to vectors of zeros.\n\nExamples\n\nrancell = RANCell(3 => 5)\ninp = rand(Float32, 3)\n#initializing the hidden states, if we want to provide them\nstate = rand(Float32, 5)\nc_state = rand(Float32, 5)\n\n#result with default initialization of internal states\nresult = rancell(inp)\n#result with internal states provided\nresult_state = rancell(inp, (state, c_state))\n\n\n\n\n\n","category":"type"},{"location":"api/cells/#RecurrentLayers.IndRNNCell","page":"Cells","title":"RecurrentLayers.IndRNNCell","text":"IndRNNCell((input_size => hidden_size)::Pair, σ=relu;\n    init_kernel = glorot_uniform,\n    init_recurrent_kernel = glorot_uniform,\n    bias = true)\n\nIndependently recurrent cell. See IndRNN for a layer that processes entire sequences.\n\nArguments\n\ninput_size => hidden_size: input and inner dimension of the layer\nσ: activation function. Default is relu\ninit_kernel: initializer for the input to hidden weights\ninit_recurrent_kernel: initializer for the hidden to hidden weights\nbias: include a bias or not. Default is true\n\nEquations\n\nmathbfh_t = sigma(mathbfW mathbfx_t + mathbfu odot mathbfh_t-1 + mathbfb)\n\nForward\n\nindrnncell(inp, [state])\n\nThe arguments of the forward pass are:\n\ninp: The input to the indrnncell. It should be a vector of size input_size or a matrix of size input_size x batch_size.\nstate: The hidden state of the IndRNNCell. It should be a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros.\n\nReturns a tuple (output, state), where both elements are given by the updated state new_state, a tensor of size hidden_size or hidden_size x batch_size.\n\n\n\n\n\n","category":"type"},{"location":"api/cells/#RecurrentLayers.LightRUCell","page":"Cells","title":"RecurrentLayers.LightRUCell","text":"LightRUCell((input_size => hidden_size)::Pair;\n    init_kernel = glorot_uniform,\n    init_recurrent_kernel = glorot_uniform,\n    bias = true)\n\nLight recurrent unit. See LightRU for a layer that processes entire sequences.\n\nArguments\n\ninput_size => hidden_size: input and inner dimension of the layer\ninit_kernel: initializer for the input to hidden weights\ninit_recurrent_kernel: initializer for the hidden to hidden weights\nbias: include a bias or not. Default is true\n\nEquations\n\nbeginaligned\ntildeh_t = tanh(W_h x_t) \nf_t = delta(W_f x_t + U_f h_t-1 + b_f) \nh_t = (1 - f_t) odot h_t-1 + f_t odot tildeh_t\nendaligned\n\nForward\n\nlightrucell(inp, [state])\n\nThe arguments of the forward pass are:\n\ninp: The input to the lightrucell. It should be a vector of size input_size or a matrix of size input_size x batch_size.\nstate: The hidden state of the LightRUCell. It should be a vector of size hidden_size or a matrix of size hidden_size x batch_size. 
If not provided, it is assumed to be a vector of zeros.\n\nReturns a tuple (output, state), where both elements are given by the updated state new_state, a tensor of size hidden_size or hidden_size x batch_size.\n\n\n\n\n\n","category":"type"},{"location":"api/cells/#RecurrentLayers.LiGRUCell","page":"Cells","title":"RecurrentLayers.LiGRUCell","text":"LiGRUCell((input_size => hidden_size)::Pair;\n init_kernel = glorot_uniform,\n init_recurrent_kernel = glorot_uniform,\n bias = true)\n\nLight gated recurrent unit. The implementation does not include the batch normalization as described in the original paper. See LiGRU for a layer that processes entire sequences.\n\nArguments\n\ninput_size => hidden_size: input and inner dimension of the layer\ninit_kernel: initializer for the input to hidden weights\ninit_recurrent_kernel: initializer for the hidden to hidden weights\nbias: include a bias or not. Default is true\n\nEquations\n\nbeginaligned\nz_t = sigma(W_z x_t + U_z h_t-1) \ntildeh_t = textReLU(W_h x_t + U_h h_t-1) \nh_t = z_t odot h_t-1 + (1 - z_t) odot tildeh_t\nendaligned\n\nForward\n\nligrucell(inp, [state])\n\nThe arguments of the forward pass are:\n\ninp: The input to the ligrucell. It should be a vector of size input_size or a matrix of size input_size x batch_size.\nstate: The hidden state of the LiGRUCell. It should be a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros.\n\nReturns a tuple (output, state), where both elements are given by the updated state new_state, a tensor of size hidden_size or hidden_size x batch_size.\n\n\n\n\n\n","category":"type"},{"location":"api/cells/#RecurrentLayers.MGUCell","page":"Cells","title":"RecurrentLayers.MGUCell","text":"MGUCell((input_size => hidden_size)::Pair;\n init_kernel = glorot_uniform,\n init_recurrent_kernel = glorot_uniform,\n bias = true)\n\nMinimal gated unit. See MGU for a layer that processes entire sequences.\n\nArguments\n\ninput_size => hidden_size: input and inner dimension of the layer\ninit_kernel: initializer for the input to hidden weights\ninit_recurrent_kernel: initializer for the hidden to hidden weights\nbias: include a bias or not. Default is true\n\nEquations\n\nbeginaligned\nf_t = sigma(W_f x_t + U_f h_t-1 + b_f) \ntildeh_t = tanh(W_h x_t + U_h (f_t odot h_t-1) + b_h) \nh_t = (1 - f_t) odot h_t-1 + f_t odot tildeh_t\nendaligned\n\nForward\n\nmgucell(inp, [state])\n\nThe arguments of the forward pass are:\n\ninp: The input to the mgucell. It should be a vector of size input_size or a matrix of size input_size x batch_size.\nstate: The hidden state of the MGUCell. It should be a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros.\n\nReturns a tuple (output, state), where both elements are given by the updated state new_state, a tensor of size hidden_size or hidden_size x batch_size.\n\n\n\n\n\n","category":"type"},{"location":"api/cells/#RecurrentLayers.NASCell","page":"Cells","title":"RecurrentLayers.NASCell","text":"NASCell((input_size => hidden_size);\n init_kernel = glorot_uniform,\n init_recurrent_kernel = glorot_uniform,\n bias = true)\n\nNeural Architecture Search unit. See NAS for a layer that processes entire sequences.\n\nArguments\n\ninput_size => hidden_size: input and inner dimension of the layer\ninit_kernel: initializer for the input to hidden weights\ninit_recurrent_kernel: initializer for the hidden to hidden weights\nbias: include a bias or not. 
Default is true\n\nEquations\n\nbeginaligned\ntextFirst Layer Outputs \no_1 = sigma(W_i^(1) x_t + W_h^(1) h_t-1 + b^(1)) \no_2 = textReLU(W_i^(2) x_t + W_h^(2) h_t-1 + b^(2)) \no_3 = sigma(W_i^(3) x_t + W_h^(3) h_t-1 + b^(3)) \no_4 = textReLU(W_i^(4) x_t cdot W_h^(4) h_t-1) \no_5 = tanh(W_i^(5) x_t + W_h^(5) h_t-1 + b^(5)) \no_6 = sigma(W_i^(6) x_t + W_h^(6) h_t-1 + b^(6)) \no_7 = tanh(W_i^(7) x_t + W_h^(7) h_t-1 + b^(7)) \no_8 = sigma(W_i^(8) x_t + W_h^(8) h_t-1 + b^(8)) \n\ntextSecond Layer Computations \nl_1 = tanh(o_1 cdot o_2) \nl_2 = tanh(o_3 + o_4) \nl_3 = tanh(o_5 cdot o_6) \nl_4 = sigma(o_7 + o_8) \n\ntextInject Cell State \nl_1 = tanh(l_1 + c_textstate) \n\ntextFinal Layer Computations \nc_textnew = l_1 cdot l_2 \nl_5 = tanh(l_3 + l_4) \nh_textnew = tanh(c_textnew cdot l_5)\nendaligned\n\nForward\n\nnascell(inp, (state, cstate))\nnascell(inp)\n\nArguments\n\ninp: The input to the nascell. It should be a vector of size input_size or a matrix of size input_size x batch_size.\n(state, cstate): A tuple containing the hidden and cell states of the NASCell. They should be vectors of size hidden_size or matrices of size hidden_size x batch_size. If not provided, they are assumed to be vectors of zeros.\n\nReturns\n\nA tuple (output, state), where output = new_state is the new hidden state and state = (new_state, new_cstate) is the new hidden and cell state. They are tensors of size hidden_size or hidden_size x batch_size.\n\n\n\n\n\n","category":"type"},{"location":"api/cells/#RecurrentLayers.RHNCell","page":"Cells","title":"RecurrentLayers.RHNCell","text":"RHNCell((input_size => hidden_size), depth=3;\n    couple_carry::Bool = true,\n    cell_kwargs...)\n\nRecurrent highway network. See RHNCellUnit for the unit component of this layer. See RHN for a layer that processes entire sequences.\n\nArguments\n\ninput_size => hidden_size: input and inner dimension of the layer\ndepth: depth of the recurrence. Default is 3\ncouple_carry: couples the carry gate and the transform gate. Default true\ninit_kernel: initializer for the input to hidden weights\nbias: include a bias or not. Default is true\n\nEquations\n\nbeginaligned\ns_ell^t = h_ell^t odot t_ell^t + s_ell-1^t odot c_ell^t \ntextwhere \nh_ell^t = tanh(W_h x^tmathbbI_ell = 1 + U_h_ell s_ell-1^t + b_h_ell) \nt_ell^t = sigma(W_t x^tmathbbI_ell = 1 + U_t_ell s_ell-1^t + b_t_ell) \nc_ell^t = sigma(W_c x^tmathbbI_ell = 1 + U_c_ell s_ell-1^t + b_c_ell)\nendaligned\n\nForward\n\nrnncell(inp, [state])\n\n\n\n\n\n","category":"type"},{"location":"api/cells/#RecurrentLayers.RHNCellUnit","page":"Cells","title":"RecurrentLayers.RHNCellUnit","text":"RHNCellUnit((input_size => hidden_size)::Pair;\n    init_kernel = glorot_uniform,\n    bias = true)\n\n\n\n\n\n","category":"type"},{"location":"api/cells/#RecurrentLayers.MUT1Cell","page":"Cells","title":"RecurrentLayers.MUT1Cell","text":"MUT1Cell((input_size => hidden_size);\n    init_kernel = glorot_uniform,\n    init_recurrent_kernel = glorot_uniform,\n    bias = true)\n\nMutated unit 1 cell. See MUT1 for a layer that processes entire sequences.\n\nArguments\n\ninput_size => hidden_size: input and inner dimension of the layer\ninit_kernel: initializer for the input to hidden weights\ninit_recurrent_kernel: initializer for the hidden to hidden weights\nbias: include a bias or not. 
Default is true\n\nEquations\n\nbeginaligned\nz = sigma(W_z x_t + b_z) \nr = sigma(W_r x_t + U_r h_t + b_r) \nh_t+1 = tanh(U_h (r odot h_t) + tanh(W_h x_t) + b_h) odot z \nquad + h_t odot (1 - z)\nendaligned\n\nForward\n\nmutcell(inp, [state])\n\nThe arguments of the forward pass are:\n\ninp: The input to the mutcell. It should be a vector of size input_size or a matrix of size input_size x batch_size.\nstate: The hidden state of the MUTCell. It should be a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros.\n\nReturns a tuple (output, state), where both elements are given by the updated state new_state, a tensor of size hidden_size or hidden_size x batch_size.\n\n\n\n\n\n","category":"type"},{"location":"api/cells/#RecurrentLayers.MUT2Cell","page":"Cells","title":"RecurrentLayers.MUT2Cell","text":"MUT2Cell((input_size => hidden_size);\n init_kernel = glorot_uniform,\n init_recurrent_kernel = glorot_uniform,\n bias = true)\n\nMutated unit 2 cell. See MUT2 for a layer that processes entire sequences.\n\nArguments\n\ninput_size => hidden_size: input and inner dimension of the layer\ninit_kernel: initializer for the input to hidden weights\ninit_recurrent_kernel: initializer for the hidden to hidden weights\nbias: include a bias or not. Default is true\n\nEquations\n\nbeginaligned\nz = sigma(W_z x_t + U_z h_t + b_z) \nr = sigma(x_t + U_r h_t + b_r) \nh_t+1 = tanh(U_h (r odot h_t) + W_h x_t + b_h) odot z \nquad + h_t odot (1 - z)\nendaligned\n\nForward\n\nmutcell(inp, [state])\n\nThe arguments of the forward pass are:\n\ninp: The input to the mutcell. It should be a vector of size input_size or a matrix of size input_size x batch_size.\nstate: The hidden state of the MUTCell. It should be a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros.\n\nReturns a tuple (output, state), where both elements are given by the updated state new_state, a tensor of size hidden_size or hidden_size x batch_size.\n\n\n\n\n\n","category":"type"},{"location":"api/cells/#RecurrentLayers.MUT3Cell","page":"Cells","title":"RecurrentLayers.MUT3Cell","text":"MUT3Cell((input_size => hidden_size);\n init_kernel = glorot_uniform,\n init_recurrent_kernel = glorot_uniform,\n bias = true)\n\nMutated unit 3 cell. See MUT3 for a layer that processes entire sequences.\n\nArguments\n\ninput_size => hidden_size: input and inner dimension of the layer\ninit_kernel: initializer for the input to hidden weights\ninit_recurrent_kernel: initializer for the hidden to hidden weights\nbias: include a bias or not. Default is true\n\nEquations\n\nbeginaligned\nz = sigma(W_z x_t + U_z tanh(h_t) + b_z) \nr = sigma(W_r x_t + U_r h_t + b_r) \nh_t+1 = tanh(U_h (r odot h_t) + W_h x_t + b_h) odot z \nquad + h_t odot (1 - z)\nendaligned\n\nForward\n\nmutcell(inp, [state])\n\nThe arguments of the forward pass are:\n\ninp: The input to the mutcell. It should be a vector of size input_size or a matrix of size input_size x batch_size.\nstate: The hidden state of the MUTCell. It should be a vector of size hidden_size or a matrix of size hidden_size x batch_size. 
If not provided, it is assumed to be a vector of zeros.\n\nReturns a tuple (output, state), where both elements are given by the updated state new_state, a tensor of size hidden_size or hidden_size x batch_size.\n\n\n\n\n\n","category":"type"},{"location":"api/cells/#RecurrentLayers.SCRNCell","page":"Cells","title":"RecurrentLayers.SCRNCell","text":"SCRNCell((input_size => hidden_size)::Pair;\n    init_kernel = glorot_uniform,\n    init_recurrent_kernel = glorot_uniform,\n    bias = true,\n    alpha = 0.0)\n\nStructurally constrained recurrent unit. See SCRN for a layer that processes entire sequences.\n\nArguments\n\ninput_size => hidden_size: input and inner dimension of the layer\ninit_kernel: initializer for the input to hidden weights\ninit_recurrent_kernel: initializer for the hidden to hidden weights\nbias: include a bias or not. Default is true\nalpha: structural constraint. Default is 0.0\n\nEquations\n\nbeginaligned\ns_t = (1 - alpha) W_s x_t + alpha s_t-1 \nh_t = sigma(W_h s_t + U_h h_t-1 + b_h) \ny_t = f(U_y h_t + W_y s_t)\nendaligned\n\nForward\n\nrnncell(inp, [state, c_state])\n\n\n\n\n\n","category":"type"},{"location":"api/cells/#RecurrentLayers.PeepholeLSTMCell","page":"Cells","title":"RecurrentLayers.PeepholeLSTMCell","text":"PeepholeLSTMCell((input_size => hidden_size)::Pair;\n    init_kernel = glorot_uniform,\n    init_recurrent_kernel = glorot_uniform,\n    bias = true)\n\nPeephole long short term memory cell. See PeepholeLSTM for a layer that processes entire sequences.\n\nArguments\n\ninput_size => hidden_size: input and inner dimension of the layer\ninit_kernel: initializer for the input to hidden weights\ninit_recurrent_kernel: initializer for the hidden to hidden weights\nbias: include a bias or not. Default is true\n\nEquations\n\nbeginaligned\nf_t = sigma_g(W_f x_t + U_f c_t-1 + b_f) \ni_t = sigma_g(W_i x_t + U_i c_t-1 + b_i) \no_t = sigma_g(W_o x_t + U_o c_t-1 + b_o) \nc_t = f_t odot c_t-1 + i_t odot sigma_c(W_c x_t + b_c) \nh_t = o_t odot sigma_h(c_t)\nendaligned\n\nForward\n\nlstmcell(x, [h, c])\n\nThe forward pass takes the following arguments:\n\nx: Input to the cell, which can be a vector of size in or a matrix of size in x batch_size.\nh: The hidden state vector of the cell, sized out, or a matrix of size out x batch_size.\nc: The candidate state, sized out, or a matrix of size out x batch_size.\n\nIf not provided, both h and c default to vectors of zeros.\n\nExamples\n\n\n\n\n\n","category":"type"},{"location":"api/cells/#RecurrentLayers.FastRNNCell","page":"Cells","title":"RecurrentLayers.FastRNNCell","text":"FastRNNCell((input_size => hidden_size), [activation];\n    init_kernel = glorot_uniform,\n    init_recurrent_kernel = glorot_uniform,\n    bias = true)\n\nFast recurrent neural network cell. See FastRNN for a layer that processes entire sequences.\n\nArguments\n\ninput_size => hidden_size: input and inner dimension of the layer\nactivation: the activation function, defaults to tanh_fast\ninit_kernel: initializer for the input to hidden weights\ninit_recurrent_kernel: initializer for the hidden to hidden weights\nbias: include a bias or not. Default is true\n\nEquations\n\nbeginaligned\ntildeh_t = sigma(W_h x_t + U_h h_t-1 + b) \nh_t = alpha tildeh_t + beta h_t-1\nendaligned\n\nForward\n\nfastrnncell(inp, [state])\n\nThe arguments of the forward pass are:\n\ninp: The input to the fastrnncell. It should be a vector of size input_size or a matrix of size input_size x batch_size.\nstate: The hidden state of the FastRNN. 
It should be a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros.\n\nReturns a tuple (output, state), where both elements are given by the updated state new_state, a tensor of size hidden_size or hidden_size x batch_size.\n\n\n\n\n\n","category":"type"},{"location":"api/cells/#RecurrentLayers.FastGRNNCell","page":"Cells","title":"RecurrentLayers.FastGRNNCell","text":"FastGRNNCell((input_size => hidden_size), [activation];\n init_kernel = glorot_uniform,\n init_recurrent_kernel = glorot_uniform,\n bias = true)\n\nFast gated recurrent neural network cell. See FastGRNN for a layer that processes entire sequences.\n\nArguments\n\ninput_size => hidden_size: input and inner dimension of the layer\nactivation: the activation function, defaults to tanh_fast\ninit_kernel: initializer for the input to hidden weights\ninit_recurrent_kernel: initializer for the hidden to hidden weights\nbias: include a bias or not. Default is true\n\nEquations\n\nbeginaligned\nz_t = sigma(W_z x_t + U_z h_t-1 + b_z) \ntildeh_t = tanh(W_h x_t + U_h h_t-1 + b_h) \nh_t = big((zeta (1 - z_t) + nu) odot tildeh_tbig) + z_t odot h_t-1\nendaligned\n\nForward\n\nfastgrnncell(inp, [state])\n\nThe arguments of the forward pass are:\n\ninp: The input to the fastgrnncell. It should be a vector of size input_size or a matrix of size input_size x batch_size.\nstate: The hidden state of the FastGRNN. It should be a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros.\n\nReturns a tuple (output, state), where both elements are given by the updated state new_state, a tensor of size hidden_size or hidden_size x batch_size.\n\n\n\n\n\n","category":"type"},{"location":"roadmap/#Roadmap","page":"Roadmap","title":"Roadmap","text":"","category":"section"},{"location":"roadmap/","page":"Roadmap","title":"Roadmap","text":"This page documents some planned work for RecurrentLayers.jl. Future work for this library includes additional cells such as:","category":"page"},{"location":"roadmap/","page":"Roadmap","title":"Roadmap","text":"FastRNNs and FastGRUs (current focus) arxiv\nUnitary recurrent neural networks arxiv\nModern recurrent neural networks such as LRU and minLSTM/minGRU\nQuasi recurrent neural networks arxiv","category":"page"},{"location":"roadmap/","page":"Roadmap","title":"Roadmap","text":"Additionally, some cell-independent architectures are also planned, that expand the ability of recurrent architectures and could theoretically take any cell:","category":"page"},{"location":"roadmap/","page":"Roadmap","title":"Roadmap","text":"Clockwork rnns arxiv\nPhased rnns arxiv\nSegment rnn arxiv\nFast-Slow rnns arxiv","category":"page"},{"location":"roadmap/","page":"Roadmap","title":"Roadmap","text":"An implementation of these ideally would be, for example FastSlow(RNNCell, input_size => hidden_size). 
More details on this soon!","category":"page"},{"location":"api/wrappers/#Cell-wrappers","page":"Cell Wrappers","title":"Cell wrappers","text":"","category":"section"},{"location":"api/wrappers/","page":"Cell Wrappers","title":"Cell Wrappers","text":"RAN\nIndRNN\nLightRU\nLiGRU\nMGU\nNAS\nRHN\nMUT1\nMUT2\nMUT3\nSCRN\nPeepholeLSTM\nFastRNN\nFastGRNN","category":"page"},{"location":"api/wrappers/#RecurrentLayers.RAN","page":"Cell Wrappers","title":"RecurrentLayers.RAN","text":"RAN(input_size => hidden_size; kwargs...)\n\nThe RANCell, introduced in this paper, is a recurrent cell layer which provides additional memory through the use of gates, and returns both ht and ct.\n\nSee RANCell for a layer that processes a single sequence.\n\nArguments\n\ninput_size => hidden_size: input and inner dimension of the layer\ninit_kernel: initializer for the input to hidden weights\ninit_recurrent_kernel: initializer for the hidden to hidden weights\nbias: include a bias or not. Default is true\n\nEquations\n\nbeginaligned\ntildec_t = W_c x_t \ni_t = sigma(W_i x_t + U_i h_t-1 + b_i) \nf_t = sigma(W_f x_t + U_f h_t-1 + b_f) \nc_t = i_t odot tildec_t + f_t odot c_t-1 \nh_t = g(c_t)\nendaligned\n\n\n\n\n\n","category":"type"},{"location":"api/wrappers/#RecurrentLayers.IndRNN","page":"Cell Wrappers","title":"RecurrentLayers.IndRNN","text":"IndRNN((input_size => hidden_size)::Pair, σ=relu;\n    kwargs...)\n\nIndependently recurrent network. See IndRNNCell for a layer that processes a single sequence.\n\nArguments\n\ninput_size => hidden_size: input and inner dimension of the layer\nσ: activation function. Default is relu\ninit_kernel: initializer for the input to hidden weights\ninit_recurrent_kernel: initializer for the hidden to hidden weights\nbias: include a bias or not. Default is true\n\nEquations\n\nmathbfh_t = sigma(mathbfW mathbfx_t + mathbfu odot mathbfh_t-1 + mathbfb)\n\nForward\n\nindrnn(inp, [state])\n\nThe arguments of the forward pass are:\n\ninp: The input to the indrnn. It should be a vector of size input_size x len or a matrix of size input_size x len x batch_size.\nstate: The hidden state of the IndRNN. If given, it is a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros.\n\nReturns new hidden states new_states as an array of size hidden_size x len x batch_size.\n\n\n\n\n\n","category":"type"},{"location":"api/wrappers/#RecurrentLayers.LightRU","page":"Cell Wrappers","title":"RecurrentLayers.LightRU","text":"LightRU((input_size => hidden_size)::Pair; kwargs...)\n\nLight recurrent unit network. See LightRUCell for a layer that processes a single sequence.\n\nArguments\n\ninput_size => hidden_size: input and inner dimension of the layer\ninit_kernel: initializer for the input to hidden weights\ninit_recurrent_kernel: initializer for the hidden to hidden weights\nbias: include a bias or not. Default is true\n\nEquations\n\nbeginaligned\ntildeh_t = tanh(W_h x_t) \nf_t = delta(W_f x_t + U_f h_t-1 + b_f) \nh_t = (1 - f_t) odot h_t-1 + f_t odot tildeh_t\nendaligned\n\nForward\n\nlightru(inp, [state])\n\nThe arguments of the forward pass are:\n\ninp: The input to the lightru. It should be a vector of size input_size x len or a matrix of size input_size x len x batch_size.\nstate: The hidden state of the LightRU. If given, it is a vector of size hidden_size or a matrix of size hidden_size x batch_size. 
If not provided, it is assumed to be a vector of zeros.\n\nReturns new hidden states new_states as an array of size hidden_size x len x batch_size.\n\n\n\n\n\n","category":"type"},{"location":"api/wrappers/#RecurrentLayers.LiGRU","page":"Cell Wrappers","title":"RecurrentLayers.LiGRU","text":"LiGRU((input_size => hidden_size)::Pair; kwargs...)\n\nLight gated recurrent network. The implementation does not include the batch normalization as described in the original paper. See LiGRUCell for a layer that processes a single sequence.\n\nArguments\n\ninput_size => hidden_size: input and inner dimension of the layer\ninit_kernel: initializer for the input to hidden weights\ninit_recurrent_kernel: initializer for the hidden to hidden weights\nbias: include a bias or not. Default is true\n\nEquations\n\nbeginaligned\nz_t = sigma(W_z x_t + U_z h_t-1) \ntildeh_t = textReLU(W_h x_t + U_h h_t-1) \nh_t = z_t odot h_t-1 + (1 - z_t) odot tildeh_t\nendaligned\n\nForward\n\nligru(inp, [state])\n\nThe arguments of the forward pass are:\n\ninp: The input to the ligru. It should be a vector of size input_size x len or a matrix of size input_size x len x batch_size.\nstate: The hidden state of the LiGRU. If given, it is a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros.\n\nReturns new hidden states new_states as an array of size hidden_size x len x batch_size.\n\n\n\n\n\n","category":"type"},{"location":"api/wrappers/#RecurrentLayers.MGU","page":"Cell Wrappers","title":"RecurrentLayers.MGU","text":"MGU((input_size => hidden_size)::Pair; kwargs...)\n\nMinimal gated unit network. See MGUCell for a layer that processes a single sequence.\n\nArguments\n\ninput_size => hidden_size: input and inner dimension of the layer\ninit_kernel: initializer for the input to hidden weights\ninit_recurrent_kernel: initializer for the hidden to hidden weights\nbias: include a bias or not. Default is true\n\nEquations\n\nbeginaligned\nf_t = sigma(W_f x_t + U_f h_t-1 + b_f) \ntildeh_t = tanh(W_h x_t + U_h (f_t odot h_t-1) + b_h) \nh_t = (1 - f_t) odot h_t-1 + f_t odot tildeh_t\nendaligned\n\nForward\n\nmgu(inp, [state])\n\nThe arguments of the forward pass are:\n\ninp: The input to the mgu. It should be a vector of size input_size x len or a matrix of size input_size x len x batch_size.\nstate: The hidden state of the MGU. If given, it is a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros.\n\nReturns new hidden states new_states as an array of size hidden_size x len x batch_size.\n\n\n\n\n\n","category":"type"},{"location":"api/wrappers/#RecurrentLayers.NAS","page":"Cell Wrappers","title":"RecurrentLayers.NAS","text":"NAS((input_size => hidden_size)::Pair; kwargs...)\n\nNeural Architecture Search unit. See NASCell for a layer that processes a single sequence.\n\nArguments\n\ninput_size => hidden_size: input and inner dimension of the layer\ninit_kernel: initializer for the input to hidden weights\ninit_recurrent_kernel: initializer for the hidden to hidden weights\nbias: include a bias or not. 
{"location":"api/wrappers/#RecurrentLayers.NAS","page":"Cell Wrappers","title":"RecurrentLayers.NAS","text":"NAS((input_size => hidden_size)::Pair; kwargs...)\n\nNeural Architecture Search unit. See NASCell for a layer that processes a single sequence.\n\nArguments\n\ninput_size => hidden_size: input and inner dimension of the layer\ninit_kernel: initializer for the input to hidden weights\ninit_recurrent_kernel: initializer for the hidden to hidden weights\nbias: include a bias or not. Default is true\n\nEquations\n\nbeginaligned\ntextFirst Layer Outputs \no_1 = sigma(W_i^(1) x_t + W_h^(1) h_t-1 + b^(1)) \no_2 = textReLU(W_i^(2) x_t + W_h^(2) h_t-1 + b^(2)) \no_3 = sigma(W_i^(3) x_t + W_h^(3) h_t-1 + b^(3)) \no_4 = textReLU(W_i^(4) x_t cdot W_h^(4) h_t-1) \no_5 = tanh(W_i^(5) x_t + W_h^(5) h_t-1 + b^(5)) \no_6 = sigma(W_i^(6) x_t + W_h^(6) h_t-1 + b^(6)) \no_7 = tanh(W_i^(7) x_t + W_h^(7) h_t-1 + b^(7)) \no_8 = sigma(W_i^(8) x_t + W_h^(8) h_t-1 + b^(8)) \n\ntextSecond Layer Computations \nl_1 = tanh(o_1 cdot o_2) \nl_2 = tanh(o_3 + o_4) \nl_3 = tanh(o_5 cdot o_6) \nl_4 = sigma(o_7 + o_8) \n\ntextInject Cell State \nl_1 = tanh(l_1 + c_textstate) \n\ntextFinal Layer Computations \nc_textnew = l_1 cdot l_2 \nl_5 = tanh(l_3 + l_4) \nh_textnew = tanh(c_textnew cdot l_5)\nendaligned\n\n\n\n\n\n","category":"type"},{"location":"api/wrappers/#RecurrentLayers.RHN","page":"Cell Wrappers","title":"RecurrentLayers.RHN","text":"RHN((input_size => hidden_size)::Pair, depth=3; kwargs...)\n\nRecurrent highway network. See RHNCellUnit for the unit component of this layer. See RHNCell for a layer that processes a single sequence.\n\nArguments\n\ninput_size => hidden_size: input and inner dimension of the layer\ndepth: depth of the recurrence. Default is 3\ncouple_carry: couples the carry gate and the transform gate. Default is true\ninit_kernel: initializer for the input to hidden weights\nbias: include a bias or not. Default is true\n\nEquations\n\nbeginaligned\ns_ell^t = h_ell^t odot t_ell^t + s_ell-1^t odot c_ell^t \ntextwhere \nh_ell^t = tanh(W_h x^tmathbbI_ell = 1 + U_h_ell s_ell-1^t + b_h_ell) \nt_ell^t = sigma(W_t x^tmathbbI_ell = 1 + U_t_ell s_ell-1^t + b_t_ell) \nc_ell^t = sigma(W_c x^tmathbbI_ell = 1 + U_c_ell s_ell-1^t + b_c_ell)\nendaligned\n\n\n\n\n\n","category":"type"},{"location":"api/wrappers/#RecurrentLayers.MUT1","page":"Cell Wrappers","title":"RecurrentLayers.MUT1","text":"MUT1((input_size => hidden_size); kwargs...)\n\nMutated unit 1 network. See MUT1Cell for a layer that processes a single sequence.\n\nArguments\n\ninput_size => hidden_size: input and inner dimension of the layer\ninit_kernel: initializer for the input to hidden weights\ninit_recurrent_kernel: initializer for the hidden to hidden weights\nbias: include a bias or not. Default is true\n\nEquations\n\nbeginaligned\nz = sigma(W_z x_t + b_z) \nr = sigma(W_r x_t + U_r h_t + b_r) \nh_t+1 = tanh(U_h (r odot h_t) + tanh(W_h x_t) + b_h) odot z \nquad + h_t odot (1 - z)\nendaligned\n\nForward\n\nmut(inp, [state])\n\nThe arguments of the forward pass are:\n\ninp: The input to the mut. It should be a vector of size input_size x len or a matrix of size input_size x len x batch_size.\nstate: The hidden state of the MUT. If given, it is a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros.\n\nReturns new hidden states new_states as an array of size hidden_size x len x batch_size.\n\n\n\n\n\n","category":"type"},
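A construction sketch for RHN with a non-default recurrence depth. The second positional argument follows the documented signature; passing couple_carry as a keyword is an assumption based on the argument list above:

```julia
using RecurrentLayers

# depth is the second positional argument (documented default is 3);
# couple_carry = false is assumed to decouple the carry and transform gates
rhn = RHN(4 => 8, 5; couple_carry = false)
```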
{"location":"api/wrappers/#RecurrentLayers.MUT2","page":"Cell Wrappers","title":"RecurrentLayers.MUT2","text":"MUT2((input_size => hidden_size); kwargs...)\n\nMutated unit 2 network. See MUT2Cell for a layer that processes a single sequence.\n\nArguments\n\ninput_size => hidden_size: input and inner dimension of the layer\ninit_kernel: initializer for the input to hidden weights\ninit_recurrent_kernel: initializer for the hidden to hidden weights\nbias: include a bias or not. Default is true\n\nEquations\n\nbeginaligned\nz = sigma(W_z x_t + U_z h_t + b_z) \nr = sigma(x_t + U_r h_t + b_r) \nh_t+1 = tanh(U_h (r odot h_t) + W_h x_t + b_h) odot z \nquad + h_t odot (1 - z)\nendaligned\n\nForward\n\nmut(inp, [state])\n\nThe arguments of the forward pass are:\n\ninp: The input to the mut. It should be a vector of size input_size x len or a matrix of size input_size x len x batch_size.\nstate: The hidden state of the MUT. If given, it is a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros.\n\nReturns new hidden states new_states as an array of size hidden_size x len x batch_size.\n\n\n\n\n\n","category":"type"},{"location":"api/wrappers/#RecurrentLayers.MUT3","page":"Cell Wrappers","title":"RecurrentLayers.MUT3","text":"MUT3((input_size => hidden_size); kwargs...)\n\nMutated unit 3 network. See MUT3Cell for a layer that processes a single sequence.\n\nArguments\n\ninput_size => hidden_size: input and inner dimension of the layer\ninit_kernel: initializer for the input to hidden weights\ninit_recurrent_kernel: initializer for the hidden to hidden weights\nbias: include a bias or not. Default is true\n\nEquations\n\nbeginaligned\nz = sigma(W_z x_t + U_z tanh(h_t) + b_z) \nr = sigma(W_r x_t + U_r h_t + b_r) \nh_t+1 = tanh(U_h (r odot h_t) + W_h x_t + b_h) odot z \nquad + h_t odot (1 - z)\nendaligned\n\nForward\n\nmut(inp, [state])\n\nThe arguments of the forward pass are:\n\ninp: The input to the mut. It should be a vector of size input_size x len or a matrix of size input_size x len x batch_size.\nstate: The hidden state of the MUT. If given, it is a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros.\n\nReturns new hidden states new_states as an array of size hidden_size x len x batch_size.\n\n\n\n\n\n","category":"type"},{"location":"api/wrappers/#RecurrentLayers.SCRN","page":"Cell Wrappers","title":"RecurrentLayers.SCRN","text":"SCRN((input_size => hidden_size)::Pair;\n init_kernel = glorot_uniform,\n init_recurrent_kernel = glorot_uniform,\n bias = true,\n alpha = 0.0)\n\nStructurally constrained recurrent network. See SCRNCell for a layer that processes a single sequence.\n\nArguments\n\ninput_size => hidden_size: input and inner dimension of the layer\ninit_kernel: initializer for the input to hidden weights\ninit_recurrent_kernel: initializer for the hidden to hidden weights\nbias: include a bias or not. Default is true\nalpha: structural constraint. Default is 0.0\n\nEquations\n\nbeginaligned\ns_t = (1 - alpha) W_s x_t + alpha s_t-1 \nh_t = sigma(W_h s_t + U_h h_t-1 + b_h) \ny_t = f(U_y h_t + W_y s_t)\nendaligned\n\n\n\n\n\n","category":"type"},
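A construction sketch for SCRN using the keyword arguments from its docstring; the value of alpha is an arbitrary illustration, not a recommended setting:

```julia
using RecurrentLayers

# a larger alpha keeps more of the slow context state s_t across steps
scrn = SCRN(4 => 8; alpha = 0.5)
```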
{"location":"api/wrappers/#RecurrentLayers.PeepholeLSTM","page":"Cell Wrappers","title":"RecurrentLayers.PeepholeLSTM","text":"PeepholeLSTM((input_size => hidden_size)::Pair; kwargs...)\n\nPeephole long short term memory network. See PeepholeLSTMCell for a layer that processes a single sequence.\n\nArguments\n\ninput_size => hidden_size: input and inner dimension of the layer\ninit_kernel: initializer for the input to hidden weights\ninit_recurrent_kernel: initializer for the hidden to hidden weights\nbias: include a bias or not. Default is true\n\nEquations\n\nbeginalign\nf_t = sigma_g(W_f x_t + U_f c_t-1 + b_f) \ni_t = sigma_g(W_i x_t + U_i c_t-1 + b_i) \no_t = sigma_g(W_o x_t + U_o c_t-1 + b_o) \nc_t = f_t odot c_t-1 + i_t odot sigma_c(W_c x_t + b_c) \nh_t = o_t odot sigma_h(c_t)\nendalign\n\n\n\n\n\n","category":"type"},{"location":"api/wrappers/#RecurrentLayers.FastRNN","page":"Cell Wrappers","title":"RecurrentLayers.FastRNN","text":"FastRNN((input_size => hidden_size), [activation]; kwargs...)\n\nFast recurrent neural network. See FastRNNCell for a layer that processes a single sequence.\n\nArguments\n\ninput_size => hidden_size: input and inner dimension of the layer\nactivation: the activation function, defaults to tanh_fast\ninit_kernel: initializer for the input to hidden weights\ninit_recurrent_kernel: initializer for the hidden to hidden weights\nbias: include a bias or not. Default is true\n\nEquations\n\nbeginaligned\ntildeh_t = sigma(W_h x_t + U_h h_t-1 + b) \nh_t = alpha tildeh_t + beta h_t-1\nendaligned\n\nForward\n\nfastrnn(inp, [state])\n\nThe arguments of the forward pass are:\n\ninp: The input to the fastrnn. It should be a vector of size input_size x len or a matrix of size input_size x len x batch_size.\nstate: The hidden state of the FastRNN. If given, it is a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros.\n\nReturns new hidden states new_states as an array of size hidden_size x len x batch_size.\n\n\n\n\n\n","category":"type"},{"location":"api/wrappers/#RecurrentLayers.FastGRNN","page":"Cell Wrappers","title":"RecurrentLayers.FastGRNN","text":"FastGRNN((input_size => hidden_size), [activation]; kwargs...)\n\nFast gated recurrent neural network. See FastGRNNCell for a layer that processes a single sequence.\n\nArguments\n\ninput_size => hidden_size: input and inner dimension of the layer\nactivation: the activation function, defaults to tanh_fast\ninit_kernel: initializer for the input to hidden weights\ninit_recurrent_kernel: initializer for the hidden to hidden weights\nbias: include a bias or not. Default is true\n\nEquations\n\nbeginaligned\nz_t = sigma(W_z x_t + U_z h_t-1 + b_z) \ntildeh_t = tanh(W_h x_t + U_h h_t-1 + b_h) \nh_t = big((zeta (1 - z_t) + nu) odot tildeh_tbig) + z_t odot h_t-1\nendaligned\n\nForward\n\nfastgrnn(inp, [state])\n\nThe arguments of the forward pass are:\n\ninp: The input to the fastgrnn. It should be a vector of size input_size x len or a matrix of size input_size x len x batch_size.\nstate: The hidden state of the FastGRNN. If given, it is a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros.\n\nReturns new hidden states new_states as an array of size hidden_size x len x batch_size.\n\n\n\n\n\n","category":"type"},
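A sketch of the optional positional activation argument documented for FastRNN; relu is assumed to be available through Flux's re-export of the standard NNlib activations, and the sizes are illustrative:

```julia
using Flux, RecurrentLayers

fastrnn = FastRNN(4 => 8, relu)   # activation replaces the tanh_fast default

inp = rand(Float32, 4, 10, 3)     # input_size x len x batch_size
states = fastrnn(inp)             # hidden_size x len x batch_size
```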
{"location":"","page":"Home","title":"Home","text":"CurrentModule = RecurrentLayers","category":"page"},{"location":"#RecurrentLayers","page":"Home","title":"RecurrentLayers","text":"","category":"section"},{"location":"","page":"Home","title":"Home","text":"RecurrentLayers.jl extends Flux.jl's recurrent layer offering by providing implementations of bleeding-edge recurrent layers not commonly available in base deep learning libraries. It is designed for seamless integration with the larger Flux ecosystem, enabling researchers and practitioners to leverage the latest developments in recurrent neural networks.","category":"page"},{"location":"#Implemented-layers","page":"Home","title":"Implemented layers","text":"","category":"section"},{"location":"","page":"Home","title":"Home","text":"Minimal gated unit as MGUCell arxiv\nLight gated recurrent unit as LiGRUCell arxiv\nIndependently recurrent neural networks as IndRNNCell arxiv\nRecurrent additive networks as RANCell arxiv\nRecurrent highway network as RHNCell arxiv\nLight recurrent unit as LightRUCell pub\nNeural architecture search unit NASCell arxiv\nEvolving recurrent neural networks as MUT1Cell, MUT2Cell, MUT3Cell pub\nStructurally constrained recurrent neural network as SCRNCell arxiv\nPeephole long short term memory as PeepholeLSTMCell pub\nFastRNNCell and FastGRNNCell arxiv","category":"page"},{"location":"#Contributing","page":"Home","title":"Contributing","text":"","category":"section"},{"location":"","page":"Home","title":"Home","text":"Contributions are always welcome! We specifically look for:","category":"page"},{"location":"","page":"Home","title":"Home","text":"Recurrent cells you would like to see implemented\nBenchmarks\nAny bugs and mistakes of course!\nDocumentation, in any form: examples, how tos, docstrings","category":"page"}] }
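A sketch of the Flux integration described on the home page: dropping a RecurrentLayers wrapper into a Flux Chain. The layer choice, sizes, and last-state selection are illustrative assumptions, not a prescribed pattern:

```julia
using Flux, RecurrentLayers

model = Chain(
    MGU(4 => 8),              # returns hidden states, 8 x len x batch_size
    x -> x[:, end, :],        # keep only the last hidden state per sequence
    Dense(8 => 2),            # illustrative classification head
)

inp = rand(Float32, 4, 10, 3) # input_size x len x batch_size
out = model(inp)              # 2 x 3
```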