From 52b405e5d0e880bcfe80687568d58e29b21bdd09 Mon Sep 17 00:00:00 2001 From: "Documenter.jl" Date: Tue, 17 Dec 2024 10:29:59 +0000 Subject: [PATCH] build based on 2b043c1 --- dev/.documenter-siteinfo.json | 2 +- dev/api/cells/index.html | 30 +++++++++++++++--------------- dev/api/layers/index.html | 28 ++++++++++++++-------------- dev/index.html | 2 +- dev/roadmap/index.html | 2 +- dev/search_index.js | 2 +- 6 files changed, 33 insertions(+), 33 deletions(-) diff --git a/dev/.documenter-siteinfo.json b/dev/.documenter-siteinfo.json index ec1ff9c..92068ee 100644 --- a/dev/.documenter-siteinfo.json +++ b/dev/.documenter-siteinfo.json @@ -1 +1 @@ -{"documenter":{"julia_version":"1.11.2","generation_timestamp":"2024-12-17T08:48:46","documenter_version":"1.8.0"}} \ No newline at end of file +{"documenter":{"julia_version":"1.11.2","generation_timestamp":"2024-12-17T10:29:51","documenter_version":"1.8.0"}} \ No newline at end of file diff --git a/dev/api/cells/index.html b/dev/api/cells/index.html index c462d2e..10b568b 100644 --- a/dev/api/cells/index.html +++ b/dev/api/cells/index.html @@ -9,11 +9,11 @@ c_t &= i_t \odot \tilde{c}_t + f_t \odot c_{t-1}, \\ h_t &= g(c_t) \end{aligned}\]

Forward

rancell(inp, (state, cstate))
-rancell(inp)

Arguments

Returns

source
RecurrentLayers.IndRNNCellType
IndRNNCell((input_size => hidden_size)::Pair, σ=relu;
+rancell(inp)

Arguments

  • inp: The input to the rancell. It should be a vector of size input_size or a matrix of size input_size x batch_size.
  • (state, cstate): A tuple containing the hidden and cell states of the RANCell. They should be vectors of size hidden_size or matrices of size hidden_size x batch_size. If not provided, they are assumed to be vectors of zeros, initialized by Flux.initialstates.

Returns

  • A tuple (output, state), where output = new_state is the new hidden state and state = (new_state, new_cstate) is the new hidden and cell state. They are tensors of size hidden_size or hidden_size x batch_size.
source
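Example

A brief usage sketch of the cell API described above. The sizes are made up for illustration; only the constructor and forward signatures documented in this docstring are assumed.

```julia
using Flux, RecurrentLayers

rancell = RANCell(3 => 5)              # input_size = 3, hidden_size = 5
inp = rand(Float32, 3, 8)              # input_size x batch_size

# omitting the state starts from zeros (Flux.initialstates)
output, (state, cstate) = rancell(inp)

# later steps pass the previous hidden and cell state explicitly
output, (state, cstate) = rancell(inp, (state, cstate))
size(output)                           # (5, 8)
```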
RecurrentLayers.IndRNNCellType
IndRNNCell((input_size => hidden_size)::Pair, σ=relu;
     init_kernel = glorot_uniform,
     init_recurrent_kernel = glorot_uniform,
     bias = true)

Independently recurrent cell. See IndRNN for a layer that processes entire sequences.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer
  • σ: activation function. Default is relu
  • init_kernel: initializer for the input to hidden weights
  • init_recurrent_kernel: initializer for the hidden to hidden weights
  • bias: include a bias or not. Default is true

Equations

\[\mathbf{h}_{t} = \sigma(\mathbf{W} \mathbf{x}_t + \mathbf{u} \odot \mathbf{h}_{t-1} + \mathbf{b})\]

Forward

indrnncell(inp, state)
-indrnncell(inp)

Arguments

  • inp: The input to the indrnncell. It should be a vector of size input_size or a matrix of size input_size x batch_size.
  • state: The hidden state of the IndRNNCell. It should be a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros.

Returns

  • A tuple (output, state), where both elements are given by the updated state new_state, a tensor of size hidden_size or hidden_size x batch_size.
source
RecurrentLayers.LightRUCellType
LightRUCell((input_size => hidden_size)::Pair;
+indrnncell(inp)

Arguments

  • inp: The input to the indrnncell. It should be a vector of size input_size or a matrix of size input_size x batch_size.
  • state: The hidden state of the IndRNNCell. It should be a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros, initialized by Flux.initialstates.

Returns

  • A tuple (output, state), where both elements are given by the updated state new_state, a tensor of size hidden_size or hidden_size x batch_size.
source
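Example

A short sketch of the activation and initializer options from the signature above; the specific choices (tanh, Flux.glorot_normal) and sizes are arbitrary.

```julia
using Flux, RecurrentLayers

indrnncell = IndRNNCell(3 => 5, tanh; init_recurrent_kernel = Flux.glorot_normal)

inp = rand(Float32, 3)                 # one unbatched input of size input_size
output, state = indrnncell(inp)        # zero initial state assumed
output, state = indrnncell(inp, state)
```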
RecurrentLayers.LightRUCellType
LightRUCell((input_size => hidden_size)::Pair;
     init_kernel = glorot_uniform,
     init_recurrent_kernel = glorot_uniform,
     bias = true)

Light recurrent unit. See LightRU for a layer that processes entire sequences.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer
  • init_kernel: initializer for the input to hidden weights
  • init_recurrent_kernel: initializer for the hidden to hidden weights
  • bias: include a bias or not. Default is true

Equations

\[\begin{aligned} @@ -21,7 +21,7 @@ f_t &= \delta(W_f x_t + U_f h_{t-1} + b_f), \\ h_t &= (1 - f_t) \odot h_{t-1} + f_t \odot \tilde{h}_t. \end{aligned}\]

Forward

lightrucell(inp, state)
-lightrucell(inp)

Arguments

  • inp: The input to the lightrucell. It should be a vector of size input_size or a matrix of size input_size x batch_size.
  • state: The hidden state of the LightRUCell. It should be a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros.

Returns

  • A tuple (output, state), where both elements are given by the updated state new_state, a tensor of size hidden_size or hidden_size x batch_size.
source
RecurrentLayers.LiGRUCellType
LiGRUCell((input_size => hidden_size)::Pair;
+lightrucell(inp)

Arguments

  • inp: The input to the lightrucell. It should be a vector of size input_size or a matrix of size input_size x batch_size.
  • state: The hidden state of the LightRUCell. It should be a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros, initialized by Flux.initialstates.

Returns

  • A tuple (output, state), where both elements are given by the updated state new_state, a tensor of size hidden_size or hidden_size x batch_size.
source
RecurrentLayers.LiGRUCellType
LiGRUCell((input_size => hidden_size)::Pair;
     init_kernel = glorot_uniform,
     init_recurrent_kernel = glorot_uniform,
     bias = true)

Light gated recurrent unit. The implementation does not include the batch normalization as described in the original paper. See LiGRU for a layer that processes entire sequences.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer
  • init_kernel: initializer for the input to hidden weights
  • init_recurrent_kernel: initializer for the hidden to hidden weights
  • bias: include a bias or not. Default is true

Equations

\[\begin{aligned} @@ -29,7 +29,7 @@ \tilde{h}_t &= \text{ReLU}(W_h x_t + U_h h_{t-1}), \\ h_t &= z_t \odot h_{t-1} + (1 - z_t) \odot \tilde{h}_t \end{aligned}\]

Forward

ligrucell(inp, state)
-ligrucell(inp)

Arguments

  • inp: The input to the ligrucell. It should be a vector of size input_size or a matrix of size input_size x batch_size.
  • state: The hidden state of the LiGRUCell. It should be a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros.

Returns

  • A tuple (output, state), where both elements are given by the updated state new_state, a tensor of size hidden_size or hidden_size x batch_size.
source
RecurrentLayers.MGUCellType
MGUCell((input_size => hidden_size)::Pair;
+ligrucell(inp)

Arguments

  • inp: The input to the ligrucell. It should be a vector of size input_size or a matrix of size input_size x batch_size.
  • state: The hidden state of the LiGRUCell. It should be a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros, initialized by Flux.initialstates.

Returns

  • A tuple (output, state), where both elements are given by the updated state new_state, a tensor of size hidden_size or hidden_size x batch_size.
source
RecurrentLayers.MGUCellType
MGUCell((input_size => hidden_size)::Pair;
     init_kernel = glorot_uniform,
     init_recurrent_kernel = glorot_uniform,
     bias = true)

Minimal gated unit. See MGU for a layer that processes entire sequences.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer
  • init_kernel: initializer for the input to hidden weights
  • init_recurrent_kernel: initializer for the hidden to hidden weights
  • bias: include a bias or not. Default is true

Equations

\[\begin{aligned} @@ -37,7 +37,7 @@ \tilde{h}_t &= \tanh(W_h x_t + U_h (f_t \odot h_{t-1}) + b_h), \\ h_t &= (1 - f_t) \odot h_{t-1} + f_t \odot \tilde{h}_t \end{aligned}\]

Forward

mgucell(inp, state)
-mgucell(inp)

Arguments

  • inp: The input to the mgucell. It should be a vector of size input_size or a matrix of size input_size x batch_size.
  • state: The hidden state of the MGUCell. It should be a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros.

Returns

  • A tuple (output, state), where both elements are given by the updated state new_state, a tensor of size hidden_size or hidden_size x batch_size.
source
RecurrentLayers.NASCellType
NASCell((input_size => hidden_size);
+mgucell(inp)

Arguments

  • inp: The input to the mgucell. It should be a vector of size input_size or a matrix of size input_size x batch_size.
  • state: The hidden state of the MGUCell. It should be a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros, initialized by Flux.initialstates.

Returns

  • A tuple (output, state), where both elements are given by the updated state new_state, a tensor of size hidden_size or hidden_size x batch_size.
source
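Example

A sketch of driving a cell over a sequence step by step; run_sequence is a hypothetical helper written for illustration, not part of the package.

```julia
using Flux, RecurrentLayers

# carry the hidden state forward through a vector of time steps
function run_sequence(cell, steps)
    output, state = cell(steps[1])     # zero initial state assumed
    for x in steps[2:end]
        output, state = cell(x, state)
    end
    return output
end

mgucell = MGUCell(3 => 5)
steps = [rand(Float32, 3, 8) for _ in 1:10]   # 10 steps, batch of 8
run_sequence(mgucell, steps)                  # final hidden state, 5 x 8
```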
RecurrentLayers.NASCellType
NASCell((input_size => hidden_size);
     init_kernel = glorot_uniform,
     init_recurrent_kernel = glorot_uniform,
     bias = true)

Neural Architecture Search unit. See NAS for a layer that processes entire sequences.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer
  • init_kernel: initializer for the input to hidden weights
  • init_recurrent_kernel: initializer for the hidden to hidden weights
  • bias: include a bias or not. Default is true

Equations

\[\begin{aligned} @@ -65,7 +65,7 @@ l_5 &= \tanh(l_3 + l_4) \\ h_{\text{new}} &= \tanh(c_{\text{new}} \cdot l_5) \end{aligned}\]

Forward

nascell(inp, (state, cstate))
-nascell(inp)

Arguments

  • inp: The input to the nascell. It should be a vector of size input_size or a matrix of size input_size x batch_size.
  • (state, cstate): A tuple containing the hidden and cell states of the NASCell. They should be vectors of size hidden_size or matrices of size hidden_size x batch_size. If not provided, they are assumed to be vectors of zeros.

Returns

  • A tuple (output, state), where output = new_state is the new hidden state and state = (new_state, new_cstate) is the new hidden and cell state. They are tensors of size hidden_size or hidden_size x batch_size.
source
RecurrentLayers.RHNCellType
RHNCell((input_size => hidden_size), depth=3;
+nascell(inp)

Arguments

  • inp: The input to the nascell. It should be a vector of size input_size or a matrix of size input_size x batch_size.
  • (state, cstate): A tuple containing the hidden and cell states of the NASCell. They should be vectors of size hidden_size or matrices of size hidden_size x batch_size. If not provided, they are assumed to be vectors of zeros, initialized by Flux.initialstates.

Returns

  • A tuple (output, state), where output = new_state is the new hidden state and state = (new_state, new_cstate) is the new hidden and cell state. They are tensors of size hidden_size or hidden_size x batch_size.
source
RecurrentLayers.RHNCellType
RHNCell((input_size => hidden_size), depth=3;
     couple_carry::Bool = true,
     cell_kwargs...)

Recurrent highway network. See RHNCellUnit for the unit component of this layer. See RHN for a layer that processes entire sequences.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer
  • depth: depth of the recurrence. Default is 3
  • couple_carry: couples the carry gate and the transform gate. Default true
  • init_kernel: initializer for the input to hidden weights
  • bias: include a bias or not. Default is true

Equations

\[\begin{aligned} s_{\ell}^{[t]} &= h_{\ell}^{[t]} \odot t_{\ell}^{[t]} + s_{\ell-1}^{[t]} \odot c_{\ell}^{[t]}, \\ @@ -73,9 +73,9 @@ h_{\ell}^{[t]} &= \tanh(W_h x^{[t]}\mathbb{I}_{\ell = 1} + U_{h_{\ell}} s_{\ell-1}^{[t]} + b_{h_{\ell}}), \\ t_{\ell}^{[t]} &= \sigma(W_t x^{[t]}\mathbb{I}_{\ell = 1} + U_{t_{\ell}} s_{\ell-1}^{[t]} + b_{t_{\ell}}), \\ c_{\ell}^{[t]} &= \sigma(W_c x^{[t]}\mathbb{I}_{\ell = 1} + U_{c_{\ell}} s_{\ell-1}^{[t]} + b_{c_{\ell}}) -\end{aligned}\]

Forward

rnncell(inp, [state])
source
RecurrentLayers.RHNCellUnitType
RHNCellUnit((input_size => hidden_size)::Pair;
+\end{aligned}\]

Forward

rnncell(inp, [state])
source
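Example

A sketch assuming the documented constructor; the (output, state) return convention of the other cells on this page is assumed to hold here as well.

```julia
using Flux, RecurrentLayers

# depth is positional, couple_carry is a keyword, per the signature above
rhncell = RHNCell(3 => 5, 4; couple_carry = false)

inp = rand(Float32, 3, 8)
output, state = rhncell(inp)           # zero initial state assumed
output, state = rhncell(inp, state)
```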
RecurrentLayers.RHNCellUnitType
RHNCellUnit((input_size => hidden_size)::Pair;
     init_kernel = glorot_uniform,
-    bias = true)
source
RecurrentLayers.MUT1CellType
MUT1Cell((input_size => hidden_size);
+    bias = true)
source
RecurrentLayers.MUT1CellType
MUT1Cell((input_size => hidden_size);
     init_kernel = glorot_uniform,
     init_recurrent_kernel = glorot_uniform,
     bias = true)

Mutated unit 1 cell. See MUT1 for a layer that processes entire sequences.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer
  • init_kernel: initializer for the input to hidden weights
  • init_recurrent_kernel: initializer for the hidden to hidden weights
  • bias: include a bias or not. Default is true

Equations

\[\begin{aligned} @@ -84,7 +84,7 @@ h_{t+1} &= \tanh(U_h (r \odot h_t) + \tanh(W_h x_t) + b_h) \odot z \\ &\quad + h_t \odot (1 - z). \end{aligned}\]

Forward

mutcell(inp, state)
-mutcell(inp)

Arguments

  • inp: The input to the mutcell. It should be a vector of size input_size or a matrix of size input_size x batch_size.
  • state: The hidden state of the MUTCell. It should be a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros.

Returns

  • A tuple (output, state), where both elements are given by the updated state new_state, a tensor of size hidden_size or hidden_size x batch_size.

source
RecurrentLayers.MUT2CellType
MUT2Cell((input_size => hidden_size);
+mutcell(inp)

Arguments

  • inp: The input to the mutcell. It should be a vector of size input_size or a matrix of size input_size x batch_size.
  • state: The hidden state of the MUTCell. It should be a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros, initialized by Flux.initialstates.

Returns

  • A tuple (output, state), where both elements are given by the updated state new_state, a tensor of size hidden_size or hidden_size x batch_size.

source
RecurrentLayers.MUT2CellType
MUT2Cell((input_size => hidden_size);
     init_kernel = glorot_uniform,
     init_recurrent_kernel = glorot_uniform,
     bias = true)

Mutated unit 2 cell. See MUT2 for a layer that processes entire sequences.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer
  • init_kernel: initializer for the input to hidden weights
  • init_recurrent_kernel: initializer for the hidden to hidden weights
  • bias: include a bias or not. Default is true

Equations

\[\begin{aligned} @@ -93,7 +93,7 @@ h_{t+1} &= \tanh(U_h (r \odot h_t) + W_h x_t + b_h) \odot z \\ &\quad + h_t \odot (1 - z). \end{aligned}\]

Forward

mutcell(inp, state)
-mutcell(inp)

Arguments

  • inp: The input to the mutcell. It should be a vector of size input_size or a matrix of size input_size x batch_size.
  • state: The hidden state of the MUTCell. It should be a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros.

Returns

  • A tuple (output, state), where both elements are given by the updated state new_state, a tensor of size hidden_size or hidden_size x batch_size.

source
RecurrentLayers.MUT3CellType
MUT3Cell((input_size => hidden_size);
+mutcell(inp)

Arguments

  • inp: The input to the mutcell. It should be a vector of size input_size or a matrix of size input_size x batch_size.
  • state: The hidden state of the MUTCell. It should be a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros, initialized by Flux.initialstates.

Returns

  • A tuple (output, state), where both elements are given by the updated state new_state, a tensor of size hidden_size or hidden_size x batch_size.

source
RecurrentLayers.MUT3CellType
MUT3Cell((input_size => hidden_size);
     init_kernel = glorot_uniform,
     init_recurrent_kernel = glorot_uniform,
     bias = true)

Mutated unit 3 cell. See MUT3 for a layer that processes entire sequences.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer
  • init_kernel: initializer for the input to hidden weights
  • init_recurrent_kernel: initializer for the hidden to hidden weights
  • bias: include a bias or not. Default is true

Equations

\[\begin{aligned} @@ -102,7 +102,7 @@ h_{t+1} &= \tanh(U_h (r \odot h_t) + W_h x_t + b_h) \odot z \\ &\quad + h_t \odot (1 - z). \end{aligned}\]

Forward

mutcell(inp, state)
-mutcell(inp)

Arguments

  • inp: The input to the mutcell. It should be a vector of size input_size or a matrix of size input_size x batch_size.
  • state: The hidden state of the MUTCell. It should be a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros.

Returns

  • A tuple (output, state), where both elements are given by the updated state new_state, a tensor of size hidden_size or hidden_size x batch_size.

source
RecurrentLayers.SCRNCellType
SCRNCell((input_size => hidden_size)::Pair;
+mutcell(inp)

Arguments

  • inp: The input to the mutcell. It should be a vector of size input_size or a matrix of size input_size x batch_size.
  • state: The hidden state of the MUTCell. It should be a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros, initialized by Flux.initialstates.

Returns

  • A tuple (output, state), where both elements are given by the updated state new_state, a tensor of size hidden_size or hidden_size x batch_size.

source
RecurrentLayers.SCRNCellType
SCRNCell((input_size => hidden_size)::Pair;
     init_kernel = glorot_uniform,
     init_recurrent_kernel = glorot_uniform,
     bias = true,
@@ -111,7 +111,7 @@
 h_t &= \sigma(W_h s_t + U_h h_{t-1} + b_h), \\
 y_t &= f(U_y h_t + W_y s_t)
 \end{aligned}\]

Forward

scrncell(inp, (state, cstate))
-scrncell(inp)

Arguments

  • inp: The input to the scrncell. It should be a vector of size input_size or a matrix of size input_size x batch_size.
  • (state, cstate): A tuple containing the hidden and cell states of the SCRNCell. They should be vectors of size hidden_size or matrices of size hidden_size x batch_size. If not provided, they are assumed to be vectors of zeros.

Returns

  • A tuple (output, state), where output = new_state is the new hidden state and state = (new_state, new_cstate) is the new hidden and cell state. They are tensors of size hidden_size or hidden_size x batch_size.
source
RecurrentLayers.PeepholeLSTMCellType
PeepholeLSTMCell((input_size => hidden_size)::Pair;
+scrncell(inp)

Arguments

  • inp: The input to the scrncell. It should be a vector of size input_size or a matrix of size input_size x batch_size.
  • (state, cstate): A tuple containing the hidden and cell states of the SCRNCell. They should be vectors of size hidden_size or matrices of size hidden_size x batch_size. If not provided, they are assumed to be vectors of zeros, initialized by Flux.initialstates.

Returns

  • A tuple (output, state), where output = new_state is the new hidden state and state = (new_state, new_cstate) is the new hidden and cell state. They are tensors of size hidden_size or hidden_size x batch_size.
source
RecurrentLayers.PeepholeLSTMCellType
PeepholeLSTMCell((input_size => hidden_size)::Pair;
     init_kernel = glorot_uniform,
     init_recurrent_kernel = glorot_uniform,
     bias = true)

Peephole long short term memory cell. See PeepholeLSTM for a layer that processes entire sequences.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer
  • init_kernel: initializer for the input to hidden weights
  • init_recurrent_kernel: initializer for the hidden to hidden weights
  • bias: include a bias or not. Default is true

Equations

\[\begin{aligned} @@ -121,14 +121,14 @@ c_t &= f_t \odot c_{t-1} + i_t \odot \sigma_c(W_c x_t + b_c), \\ h_t &= o_t \odot \sigma_h(c_t). \end{aligned}\]

Forward

peepholelstmcell(inp, (state, cstate))
-peepholelstmcell(inp)

Arguments

  • inp: The input to the peepholelstmcell. It should be a vector of size input_size or a matrix of size input_size x batch_size.
  • (state, cstate): A tuple containing the hidden and cell states of the PeepholeLSTMCell. They should be vectors of size hidden_size or matrices of size hidden_size x batch_size. If not provided, they are assumed to be vectors of zeros.

Returns

  • A tuple (output, state), where output = new_state is the new hidden state and state = (new_state, new_cstate) is the new hidden and cell state. They are tensors of size hidden_size or hidden_size x batch_size.
source
RecurrentLayers.FastRNNCellType
FastRNNCell((input_size => hidden_size), [activation];
+peepholelstmcell(inp)

Arguments

  • inp: The input to the peepholelstmcell. It should be a vector of size input_size or a matrix of size input_size x batch_size.
  • (state, cstate): A tuple containing the hidden and cell states of the PeepholeLSTMCell. They should be vectors of size hidden_size or matrices of size hidden_size x batch_size. If not provided, they are assumed to be vectors of zeros, initialized by Flux.initialstates.

Returns

  • A tuple (output, state), where output = new_state is the new hidden state and state = (new_state, new_cstate) is the new hidden and cell state. They are tensors of size hidden_size or hidden_size x batch_size.
source
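Example

A brief sketch with batched input; sizes are arbitrary and only the signatures documented above are assumed.

```julia
using Flux, RecurrentLayers

plstmcell = PeepholeLSTMCell(3 => 5)
inp = rand(Float32, 3, 16)                         # input_size x batch_size

output, (state, cstate) = plstmcell(inp)           # both states start from zeros
output, (state, cstate) = plstmcell(inp, (state, cstate))
size(state), size(cstate)                          # ((5, 16), (5, 16))
```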
RecurrentLayers.FastRNNCellType
FastRNNCell((input_size => hidden_size), [activation];
     init_kernel = glorot_uniform,
     init_recurrent_kernel = glorot_uniform,
     bias = true)

Fast recurrent neural network cell. See FastRNN for a layer that processes entire sequences.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer
  • activation: the activation function, defaults to tanh_fast
  • init_kernel: initializer for the input to hidden weights
  • init_recurrent_kernel: initializer for the hidden to hidden weights
  • bias: include a bias or not. Default is true

Equations

\[\begin{aligned} \tilde{h}_t &= \sigma(W_h x_t + U_h h_{t-1} + b), \\ h_t &= \alpha \tilde{h}_t + \beta h_{t-1} \end{aligned}\]

Forward

fastrnncell(inp, state)
-fastrnncell(inp)

Arguments

  • inp: The input to the fastrnncell. It should be a vector of size input_size or a matrix of size input_size x batch_size.
  • state: The hidden state of the FastRNN. It should be a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros.

Returns

  • A tuple (output, state), where both elements are given by the updated state new_state, a tensor of size hidden_size or hidden_size x batch_size.
source
RecurrentLayers.FastGRNNCellType
FastGRNNCell((input_size => hidden_size), [activation];
+fastrnncell(inp)

Arguments

  • inp: The input to the fastrnncell. It should be a vector of size input_size or a matrix of size input_size x batch_size.
  • state: The hidden state of the FastRNN. It should be a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros, initialized by Flux.initialstates.

Returns

  • A tuple (output, state), where both elements are given by the updated state new_state, a tensor of size hidden_size or hidden_size x batch_size.
source
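Example

A sketch of the optional positional activation and the bias keyword; relu here is an arbitrary choice replacing the default tanh_fast.

```julia
using Flux, RecurrentLayers

fastrnncell = FastRNNCell(3 => 5, relu; bias = false)

inp = rand(Float32, 3)
output, state = fastrnncell(inp)        # zero initial state assumed
output, state = fastrnncell(inp, state)
```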
RecurrentLayers.FastGRNNCellType
FastGRNNCell((input_size => hidden_size), [activation];
     init_kernel = glorot_uniform,
     init_recurrent_kernel = glorot_uniform,
     bias = true)

Fast gated recurrent neural network cell. See FastGRNN for a layer that processes entire sequences.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer
  • activation: the activation function, defaults to tanh_fast
  • init_kernel: initializer for the input to hidden weights
  • init_recurrent_kernel: initializer for the hidden to hidden weights
  • bias: include a bias or not. Default is true

Equations

\[\begin{aligned} @@ -136,4 +136,4 @@ \tilde{h}_t &= \tanh(W_h x_t + U_h h_{t-1} + b_h), \\ h_t &= \big((\zeta (1 - z_t) + \nu) \odot \tilde{h}_t\big) + z_t \odot h_{t-1} \end{aligned}\]

Forward

fastgrnncell(inp, state)
-fastgrnncell(inp)

Arguments

  • inp: The input to the fastgrnncell. It should be a vector of size input_size or a matrix of size input_size x batch_size.
  • state: The hidden state of the FastGRNN. It should be a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros.

Returns

  • A tuple (output, state), where both elements are given by the updated state new_state, a tensor of size hidden_size or hidden_size x batch_size.
source
+fastgrnncell(inp)

Arguments

Returns

source diff --git a/dev/api/layers/index.html b/dev/api/layers/index.html index 48f0f3e..841b9ea 100644 --- a/dev/api/layers/index.html +++ b/dev/api/layers/index.html @@ -6,24 +6,24 @@ c_t &= i_t \odot \tilde{c}_t + f_t \odot c_{t-1}, \\ h_t &= g(c_t) \end{aligned}\]

Forward

ran(inp, (state, cstate))
-ran(inp)

Arguments

Returns

source
RecurrentLayers.IndRNNType
IndRNN((input_size => hidden_size)::Pair, σ=relu;
+ran(inp)

Arguments

  • inp: The input to the ran. It should be a vector of size input_size x len or a matrix of size input_size x len x batch_size.
  • (state, cstate): A tuple containing the hidden and cell states of the RAN. They should be vectors of size hidden_size or matrices of size hidden_size x batch_size. If not provided, they are assumed to be vectors of zeros, initialized by Flux.initialstates.

Returns

  • New hidden states new_states as an array of size hidden_size x len x batch_size.
source
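Example

A sketch of whole-sequence processing with the layer API above; the dimensions are hypothetical.

```julia
using Flux, RecurrentLayers

ran = RAN(3 => 5)
inp = rand(Float32, 3, 12, 8)          # input_size x len x batch_size

states = ran(inp)                      # zero initial states assumed
size(states)                           # (5, 12, 8): one hidden state per step
```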
RecurrentLayers.IndRNNType
IndRNN((input_size => hidden_size)::Pair, σ=relu;
     kwargs...)

Independently recurrent network. See IndRNNCell for a layer that processes a single sequence.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer
  • σ: activation function. Default is relu
  • init_kernel: initializer for the input to hidden weights
  • init_recurrent_kernel: initializer for the hidden to hidden weights
  • bias: include a bias or not. Default is true

Equations

\[\mathbf{h}_{t} = \sigma(\mathbf{W} \mathbf{x}_t + \mathbf{u} \odot \mathbf{h}_{t-1} + \mathbf{b})\]

Forward

indrnn(inp, state)
-indrnn(inp)

Arguments

  • inp: The input to the indrnn. It should be a vector of size input_size x len or a matrix of size input_size x len x batch_size.
  • state: The hidden state of the IndRNN. If given, it is a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros.

Returns

  • New hidden states new_states as an array of size hidden_size x len x batch_size.
source
RecurrentLayers.LightRUType
LightRU((input_size => hidden_size)::Pair; kwargs...)

Light recurrent unit network. See LightRUCell for a layer that processes a single sequence.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer
  • init_kernel: initializer for the input to hidden weights
  • init_recurrent_kernel: initializer for the hidden to hidden weights
  • bias: include a bias or not. Default is true

Equations

\[\begin{aligned} +indrnn(inp)

Arguments

  • inp: The input to the indrnn. It should be a vector of size input_size x len or a matrix of size input_size x len x batch_size.
  • state: The hidden state of the IndRNN. If given, it is a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros, initialized by Flux.initialstates.

Returns

  • New hidden states new_states as an array of size hidden_size x len x batch_size.
source
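Example

A sketch assuming the signature above; the activation and sizes are arbitrary.

```julia
using Flux, RecurrentLayers

indrnn = IndRNN(3 => 5, relu)
inp = rand(Float32, 3, 12, 8)          # input_size x len x batch_size

states = indrnn(inp)                   # hidden_size x len x batch_size
last_state = states[:, end, :]         # hidden state after the final step
```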
RecurrentLayers.LightRUType
LightRU((input_size => hidden_size)::Pair; kwargs...)

Light recurrent unit network. See LightRUCell for a layer that processes a single sequence.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer
  • init_kernel: initializer for the input to hidden weights
  • init_recurrent_kernel: initializer for the hidden to hidden weights
  • bias: include a bias or not. Default is true

Equations

\[\begin{aligned} \tilde{h}_t &= \tanh(W_h x_t), \\ f_t &= \delta(W_f x_t + U_f h_{t-1} + b_f), \\ h_t &= (1 - f_t) \odot h_{t-1} + f_t \odot \tilde{h}_t. \end{aligned}\]

Forward

lightru(inp, state)
-lightru(inp)

Arguments

  • inp: The input to the lightru. It should be a vector of size input_size x len or a matrix of size input_size x len x batch_size.
  • state: The hidden state of the LightRU. If given, it is a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros.

Returns

  • New hidden states new_states as an array of size hidden_size x len x batch_size.
source
RecurrentLayers.LiGRUType
LiGRU((input_size => hidden_size)::Pair; kwargs...)

Light gated recurrent network. The implementation does not include the batch normalization as described in the original paper. See LiGRUCell for a layer that processes a single sequence.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer
  • init_kernel: initializer for the input to hidden weights
  • init_recurrent_kernel: initializer for the hidden to hidden weights
  • bias: include a bias or not. Default is true

Equations

\[\begin{aligned} +lightru(inp)

Arguments

  • inp: The input to the lightru. It should be a vector of size input_size x len or a matrix of size input_size x len x batch_size.
  • state: The hidden state of the LightRU. If given, it is a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros, initialized by Flux.initialstates.

Returns

  • New hidden states new_states as an array of size hidden_size x len x batch_size.
source
RecurrentLayers.LiGRUType
LiGRU((input_size => hidden_size)::Pair; kwargs...)

Light gated recurrent network. The implementation does not include the batch normalization as described in the original paper. See LiGRUCell for a layer that processes a single sequence.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer
  • init_kernel: initializer for the input to hidden weights
  • init_recurrent_kernel: initializer for the hidden to hidden weights
  • bias: include a bias or not. Default is true

Equations

\[\begin{aligned} z_t &= \sigma(W_z x_t + U_z h_{t-1}), \\ \tilde{h}_t &= \text{ReLU}(W_h x_t + U_h h_{t-1}), \\ h_t &= z_t \odot h_{t-1} + (1 - z_t) \odot \tilde{h}_t \end{aligned}\]

Forward

ligru(inp, state)
-ligru(inp)

Arguments

  • inp: The input to the ligru. It should be a vector of size input_size x len or a matrix of size input_size x len x batch_size.
  • state: The hidden state of the LiGRU. If given, it is a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros.

Returns

  • New hidden states new_states as an array of size hidden_size x len x batch_size.
source
RecurrentLayers.MGUType
MGU((input_size => hidden_size)::Pair; kwargs...)

Minimal gated unit network. See MGUCell for a layer that processes a single sequence.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer
  • init_kernel: initializer for the input to hidden weights
  • init_recurrent_kernel: initializer for the hidden to hidden weights
  • bias: include a bias or not. Default is true

Equations

\[\begin{aligned} +ligru(inp)

Arguments

  • inp: The input to the ligru. It should be a vector of size input_size x len or a matrix of size input_size x len x batch_size.
  • state: The hidden state of the LiGRU. If given, it is a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros, initialized by Flux.initialstates.

Returns

  • New hidden states new_states as an array of size hidden_size x len x batch_size.
source
RecurrentLayers.MGUType
MGU((input_size => hidden_size)::Pair; kwargs...)

Minimal gated unit network. See MGUCell for a layer that processes a single sequence.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer
  • init_kernel: initializer for the input to hidden weights
  • init_recurrent_kernel: initializer for the hidden to hidden weights
  • bias: include a bias or not. Default is true

Equations

\[\begin{aligned} f_t &= \sigma(W_f x_t + U_f h_{t-1} + b_f), \\ \tilde{h}_t &= \tanh(W_h x_t + U_h (f_t \odot h_{t-1}) + b_h), \\ h_t &= (1 - f_t) \odot h_{t-1} + f_t \odot \tilde{h}_t \end{aligned}\]

Forward

mgu(inp, state)
-mgu(inp)

Arguments

  • inp: The input to the mgu. It should be a vector of size input_size x len or a matrix of size input_size x len x batch_size.
  • state: The hidden state of the MGU. If given, it is a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros.

Returns

  • New hidden states new_states as an array of size hidden_size x len x batch_size.
source
RecurrentLayers.NASType
NAS((input_size => hidden_size)::Pair; kwargs...)

Neural Architecture Search unit. See NASCell for a layer that processes a single sequence.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer
  • init_kernel: initializer for the input to hidden weights
  • init_recurrent_kernel: initializer for the hidden to hidden weights
  • bias: include a bias or not. Default is true

Equations

\[\begin{aligned} +mgu(inp)

Arguments

  • inp: The input to the mgu. It should be a vector of size input_size x len or a matrix of size input_size x len x batch_size.
  • state: The hidden state of the MGU. If given, it is a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros, initialized by Flux.initialstates.

Returns

  • New hidden states new_states as an array of size hidden_size x len x batch_size.
source
RecurrentLayers.NASType
NAS((input_size => hidden_size)::Pair; kwargs...)

Neural Architecture Search unit. See NASCell for a layer that processes a single sequence.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer
  • init_kernel: initializer for the input to hidden weights
  • init_recurrent_kernel: initializer for the hidden to hidden weights
  • bias: include a bias or not. Default is true

Equations

\[\begin{aligned} \text{First Layer Outputs:} & \\ o_1 &= \sigma(W_i^{(1)} x_t + W_h^{(1)} h_{t-1} + b^{(1)}), \\ o_2 &= \text{ReLU}(W_i^{(2)} x_t + W_h^{(2)} h_{t-1} + b^{(2)}), \\ @@ -48,31 +48,31 @@ l_5 &= \tanh(l_3 + l_4) \\ h_{\text{new}} &= \tanh(c_{\text{new}} \cdot l_5) \end{aligned}\]

Forward

nas(inp, (state, cstate))
-nas(inp)

Arguments

  • inp: The input to the nas. It should be a vector of size input_size x len or a matrix of size input_size x len x batch_size.
  • (state, cstate): A tuple containing the hidden and cell states of the NAS. They should be vectors of size hidden_size or matrices of size hidden_size x batch_size. If not provided, they are assumed to be vectors of zeros

Returns

  • New hidden states new_states as an array of size hidden_size x len x batch_size.
source
RecurrentLayers.RHNType
RHN((input_size => hidden_size)::Pair, depth=3; kwargs...)

Recurrent highway network. See RHNCellUnit for the unit component of this layer. See RHNCell for a layer that processes a single sequence.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer
  • depth: depth of the recurrence. Default is 3
  • couple_carry: couples the carry gate and the transform gate. Default true
  • init_kernel: initializer for the input to hidden weights
  • bias: include a bias or not. Default is true

Equations

\[\begin{aligned} +nas(inp)

Arguments

  • inp: The input to the nas. It should be a vector of size input_size x len or a matrix of size input_size x len x batch_size.
  • (state, cstate): A tuple containing the hidden and cell states of the NAS. They should be vectors of size hidden_size or matrices of size hidden_size x batch_size. If not provided, they are assumed to be vectors of zeros, initialized by Flux.initialstates.

Returns

  • New hidden states new_states as an array of size hidden_size x len x batch_size.
source
RecurrentLayers.RHNType
RHN((input_size => hidden_size)::Pair, depth=3; kwargs...)

Recurrent highway network. See RHNCellUnit for the unit component of this layer. See RHNCell for a layer that processes a single sequence.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer
  • depth: depth of the recurrence. Default is 3
  • couple_carry: couples the carry gate and the transform gate. Default true
  • init_kernel: initializer for the input to hidden weights
  • bias: include a bias or not. Default is true

Equations

\[\begin{aligned} s_{\ell}^{[t]} &= h_{\ell}^{[t]} \odot t_{\ell}^{[t]} + s_{\ell-1}^{[t]} \odot c_{\ell}^{[t]}, \\ \text{where} \\ h_{\ell}^{[t]} &= \tanh(W_h x^{[t]}\mathbb{I}_{\ell = 1} + U_{h_{\ell}} s_{\ell-1}^{[t]} + b_{h_{\ell}}), \\ t_{\ell}^{[t]} &= \sigma(W_t x^{[t]}\mathbb{I}_{\ell = 1} + U_{t_{\ell}} s_{\ell-1}^{[t]} + b_{t_{\ell}}), \\ c_{\ell}^{[t]} &= \sigma(W_c x^{[t]}\mathbb{I}_{\ell = 1} + U_{c_{\ell}} s_{\ell-1}^{[t]} + b_{c_{\ell}}) -\end{aligned}\]

source
RecurrentLayers.MUT1Type
MUT1((input_size => hidden_size); kwargs...)

Mutated unit 1 network. See MUT1Cell for a layer that processes a single sequence.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer
  • init_kernel: initializer for the input to hidden weights
  • init_recurrent_kernel: initializer for the hidden to hidden weights
  • bias: include a bias or not. Default is true

Equations

\[\begin{aligned} +\end{aligned}\]

source
RecurrentLayers.MUT1Type
MUT1((input_size => hidden_size); kwargs...)

Mutated unit 1 network. See MUT1Cell for a layer that processes a single sequence.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer
  • init_kernel: initializer for the input to hidden weights
  • init_recurrent_kernel: initializer for the hidden to hidden weights
  • bias: include a bias or not. Default is true

Equations

\[\begin{aligned} z &= \sigma(W_z x_t + b_z), \\ r &= \sigma(W_r x_t + U_r h_t + b_r), \\ h_{t+1} &= \tanh(U_h (r \odot h_t) + \tanh(W_h x_t) + b_h) \odot z \\ &\quad + h_t \odot (1 - z). \end{aligned}\]

Forward

mut(inp, state)
-mut(inp)

Arguments

  • inp: The input to the mut. It should be a vector of size input_size x len or a matrix of size input_size x len x batch_size.
  • state: The hidden state of the MUT. If given, it is a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros.

Returns

  • New hidden states new_states as an array of size hidden_size x len x batch_size.
source
RecurrentLayers.MUT2Type
MUT2((input_size => hidden_size); kwargs...)

Mutated unit 2 network. See MUT2Cell for a layer that processes a single sequence.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer
  • init_kernel: initializer for the input to hidden weights
  • init_recurrent_kernel: initializer for the hidden to hidden weights
  • bias: include a bias or not. Default is true

Equations

\[\begin{aligned} +mut(inp)

Arguments

  • inp: The input to the mut. It should be a vector of size input_size x len or a matrix of size input_size x len x batch_size.
  • state: The hidden state of the MUT. If given, it is a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros, initialized by Flux.initialstates.

Returns

  • New hidden states new_states as an array of size hidden_size x len x batch_size.
source
RecurrentLayers.MUT2Type
MUT2((input_size => hidden_size); kwargs...)

Mutated unit 2 network. See MUT2Cell for a layer that processes a single sequence.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer
  • init_kernel: initializer for the input to hidden weights
  • init_recurrent_kernel: initializer for the hidden to hidden weights
  • bias: include a bias or not. Default is true

Equations

\[\begin{aligned} z &= \sigma(W_z x_t + U_z h_t + b_z), \\ r &= \sigma(x_t + U_r h_t + b_r), \\ h_{t+1} &= \tanh(U_h (r \odot h_t) + W_h x_t + b_h) \odot z \\ &\quad + h_t \odot (1 - z). \end{aligned}\]

Forward

mut(inp, state)
-mut(inp)

Arguments

  • inp: The input to the mut. It should be a vector of size input_size x len or a matrix of size input_size x len x batch_size.
  • state: The hidden state of the MUT. If given, it is a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros.

Returns

  • New hidden states new_states as an array of size hidden_size x len x batch_size.
source
RecurrentLayers.MUT3Type
MUT3((input_size => hidden_size); kwargs...)

Mutated unit 3 network. See MUT3Cell for a layer that processes a single sequence.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer
  • init_kernel: initializer for the input to hidden weights
  • init_recurrent_kernel: initializer for the hidden to hidden weights
  • bias: include a bias or not. Default is true

Equations

\[\begin{aligned} +mut(inp)

Arguments

  • inp: The input to the mut. It should be a vector of size input_size x len or a matrix of size input_size x len x batch_size.
  • state: The hidden state of the MUT. If given, it is a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros, initialized by Flux.initialstates.

Returns

  • New hidden states new_states as an array of size hidden_size x len x batch_size.
source
RecurrentLayers.MUT3Type
MUT3((input_size => hidden_size); kwargs...)

Mutated unit 3 network. See MUT3Cell for a layer that processes a single sequence.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer
  • init_kernel: initializer for the input to hidden weights
  • init_recurrent_kernel: initializer for the hidden to hidden weights
  • bias: include a bias or not. Default is true

Equations

\[\begin{aligned} z &= \sigma(W_z x_t + U_z \tanh(h_t) + b_z), \\ r &= \sigma(W_r x_t + U_r h_t + b_r), \\ h_{t+1} &= \tanh(U_h (r \odot h_t) + W_h x_t + b_h) \odot z \\ &\quad + h_t \odot (1 - z). \end{aligned}\]

Forward

mut(inp, state)
-mut(inp)

Arguments

  • inp: The input to the mut. It should be a vector of size input_size x len or a matrix of size input_size x len x batch_size.
  • state: The hidden state of the MUT. If given, it is a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros.

Returns

  • New hidden states new_states as an array of size hidden_size x len x batch_size.
source
RecurrentLayers.SCRNType
SCRN((input_size => hidden_size)::Pair;
+mut(inp)

Arguments

  • inp: The input to the mut. It should be a vector of size input_size x len or a matrix of size input_size x len x batch_size.
  • state: The hidden state of the MUT. If given, it is a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros, initialized by Flux.initialstates.

Returns

  • New hidden states new_states as an array of size hidden_size x len x batch_size.
source
RecurrentLayers.SCRNType
SCRN((input_size => hidden_size)::Pair;
     init_kernel = glorot_uniform,
     init_recurrent_kernel = glorot_uniform,
     bias = true,
@@ -81,20 +81,20 @@
 h_t &= \sigma(W_h s_t + U_h h_{t-1} + b_h), \\
 y_t &= f(U_y h_t + W_y s_t)
 \end{aligned}\]

Forward

scrn(inp, (state, cstate))
-scrn(inp)

Arguments

  • inp: The input to the scrn. It should be a vector of size input_size x len or a matrix of size input_size x len x batch_size.
  • (state, cstate): A tuple containing the hidden and cell states of the SCRN. They should be vectors of size hidden_size or matrices of size hidden_size x batch_size. If not provided, they are assumed to be vectors of zeros

Returns

  • New hidden states new_states as an array of size hidden_size x len x batch_size.
source
RecurrentLayers.PeepholeLSTMType
PeepholeLSTM((input_size => hidden_size)::Pair; kwargs...)

Peephole long short term memory network. See PeepholeLSTMCell for a layer that processes a single sequence.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer
  • init_kernel: initializer for the input to hidden weights
  • init_recurrent_kernel: initializer for the hidden to hidden weights
  • bias: include a bias or not. Default is true

Equations

\[\begin{align} +scrn(inp)

Arguments

  • inp: The input to the scrn. It should be a vector of size input_size x len or a matrix of size input_size x len x batch_size.
  • (state, cstate): A tuple containing the hidden and cell states of the SCRN. They should be vectors of size hidden_size or matrices of size hidden_size x batch_size. If not provided, they are assumed to be vectors of zeros, initialized by Flux.initialstates.

Returns

  • New hidden states new_states as an array of size hidden_size x len x batch_size.
source
RecurrentLayers.PeepholeLSTMType
PeepholeLSTM((input_size => hidden_size)::Pair; kwargs...)

Peephole long short term memory network. See PeepholeLSTMCell for a layer that processes a single sequence.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer
  • init_kernel: initializer for the input to hidden weights
  • init_recurrent_kernel: initializer for the hidden to hidden weights
  • bias: include a bias or not. Default is true

Equations

\[\begin{align} f_t &= \sigma_g(W_f x_t + U_f c_{t-1} + b_f), \\ i_t &= \sigma_g(W_i x_t + U_i c_{t-1} + b_i), \\ o_t &= \sigma_g(W_o x_t + U_o c_{t-1} + b_o), \\ c_t &= f_t \odot c_{t-1} + i_t \odot \sigma_c(W_c x_t + b_c), \\ h_t &= o_t \odot \sigma_h(c_t). \end{align}\]

Forward

peepholelstm(inp, (state, cstate))
-peepholelstm(inp)

Arguments

  • inp: The input to the peepholelstm. It should be a vector of size input_size x len or a matrix of size input_size x len x batch_size.
  • (state, cstate): A tuple containing the hidden and cell states of the PeepholeLSTM. They should be vectors of size hidden_size or matrices of size hidden_size x batch_size. If not provided, they are assumed to be vectors of zeros

Returns

  • New hidden states new_states as an array of size hidden_size x len x batch_size.
source
RecurrentLayers.FastRNNType
FastRNN((input_size => hidden_size), [activation]; kwargs...)

Fast recurrent neural network. See FastRNNCell for a layer that processes a single sequence.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer
  • activation: the activation function, defaults to tanh_fast
  • init_kernel: initializer for the input to hidden weights
  • init_recurrent_kernel: initializer for the hidden to hidden weights
  • bias: include a bias or not. Default is true

Equations

\[\begin{aligned} +peepholelstm(inp)

Arguments

  • inp: The input to the peepholelstm. It should be a vector of size input_size x len or a matrix of size input_size x len x batch_size.
  • (state, cstate): A tuple containing the hidden and cell states of the PeepholeLSTM. They should be vectors of size hidden_size or matrices of size hidden_size x batch_size. If not provided, they are assumed to be vectors of zeros, initialized by Flux.initialstates.

Returns

  • New hidden states new_states as an array of size hidden_size x len x batch_size.
source
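Example

A sketch with an explicitly supplied initial hidden and cell state; sizes are hypothetical.

```julia
using Flux, RecurrentLayers

plstm = PeepholeLSTM(3 => 5)
inp = rand(Float32, 3, 12, 8)          # input_size x len x batch_size

state  = zeros(Float32, 5, 8)          # initial hidden state
cstate = zeros(Float32, 5, 8)          # initial cell state
states = plstm(inp, (state, cstate))   # (5, 12, 8)
```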
RecurrentLayers.FastRNNType
FastRNN((input_size => hidden_size), [activation]; kwargs...)

Fast recurrent neural network. See FastRNNCell for a layer that processes a single sequence.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer
  • activation: the activation function, defaults to tanh_fast
  • init_kernel: initializer for the input to hidden weights
  • init_recurrent_kernel: initializer for the hidden to hidden weights
  • bias: include a bias or not. Default is true

Equations

\[\begin{aligned} \tilde{h}_t &= \sigma(W_h x_t + U_h h_{t-1} + b), \\ h_t &= \alpha \tilde{h}_t + \beta h_{t-1} \end{aligned}\]

Forward

fastrnn(inp, state)
-fastrnn(inp)

Arguments

  • inp: The input to the fastrnn. It should be a vector of size input_size x len or a matrix of size input_size x len x batch_size.
  • state: The hidden state of the FastRNN. If given, it is a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros.

Returns

  • New hidden states new_states as an array of size hidden_size x len x batch_size.
source
RecurrentLayers.FastGRNNType
FastGRNN((input_size => hidden_size), [activation]; kwargs...)

Fast gated recurrent neural network. See FastGRNNCell for a layer that processes a single sequence.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer
  • activation: the activation function, defaults to tanh_fast
  • init_kernel: initializer for the input to hidden weights
  • init_recurrent_kernel: initializer for the hidden to hidden weights
  • bias: include a bias or not. Default is true

Equations

\[\begin{aligned} +fastrnn(inp)

Arguments

  • inp: The input to the fastrnn. It should be a vector of size input_size x len or a matrix of size input_size x len x batch_size.
  • state: The hidden state of the FastRNN. If given, it is a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros, initialized by Flux.initialstates.

Returns

  • New hidden states new_states as an array of size hidden_size x len x batch_size.
source
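Example

A sketch showing that the layer can be differentiated like any other Flux layer; the loss below is a placeholder chosen only for illustration.

```julia
using Flux, RecurrentLayers

fastrnn = FastRNN(3 => 5)
inp = rand(Float32, 3, 12, 8)          # input_size x len x batch_size

loss(m, x) = sum(abs2, m(x))           # placeholder loss
grads = Flux.gradient(m -> loss(m, inp), fastrnn)
```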
RecurrentLayers.FastGRNNType
FastGRNN((input_size => hidden_size), [activation]; kwargs...)

Fast gated recurrent neural network. See FastGRNNCell for a layer that processes a single sequence.

Arguments

  • input_size => hidden_size: input and inner dimension of the layer
  • activation: the activation function, defaults to tanh_fast
  • init_kernel: initializer for the input to hidden weights
  • init_recurrent_kernel: initializer for the hidden to hidden weights
  • bias: include a bias or not. Default is true

Equations

\[\begin{aligned} z_t &= \sigma(W_z x_t + U_z h_{t-1} + b_z), \\ \tilde{h}_t &= \tanh(W_h x_t + U_h h_{t-1} + b_h), \\ h_t &= \big((\zeta (1 - z_t) + \nu) \odot \tilde{h}_t\big) + z_t \odot h_{t-1} \end{aligned}\]

Forward

fastgrnn(inp, state)
-fastgrnn(inp)

Arguments

  • inp: The input to the fastgrnn. It should be a vector of size input_size x len or a matrix of size input_size x len x batch_size.
  • state: The hidden state of the FastGRNN. It should be a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros.

Returns

  • New hidden states new_states as an array of size hidden_size x len x batch_size.
source
+fastgrnn(inp)

Arguments

Returns

source diff --git a/dev/index.html b/dev/index.html index 58674d6..1b857f8 100644 --- a/dev/index.html +++ b/dev/index.html @@ -1,2 +1,2 @@ -Home · RecurrentLayers.jl

RecurrentLayers

RecurrentLayers.jl extends the recurrent layers offered by Flux.jl with implementations of bleeding-edge recurrent layers that are not commonly available in base deep learning libraries. It is designed for seamless integration with the larger Flux ecosystem, enabling researchers and practitioners to leverage the latest developments in recurrent neural networks.

Implemented layers

  • Minimal gated unit as MGUCell arxiv
  • Light gated recurrent unit as LiGRUCell arxiv
  • Independently recurrent neural networks as IndRNNCell arxiv
  • Recurrent additive networks as RANCell arxiv
  • Recurrent highway network as RHNCell arxiv
  • Light recurrent unit as LightRUCell pub
  • Neural architecture search unit as NASCell arxiv
  • Evolving recurrent neural networks as MUT1Cell, MUT2Cell, MUT3Cell pub
  • Structurally constrained recurrent neural network as SCRNCell arxiv
  • Peephole long short term memory as PeepholeLSTMCell pub
  • FastRNNCell and FastGRNNCell arxiv

Contributing

Contributions are always welcome! We specifically look for:

  • Recurrent cells you would like to see implemented
  • Benchmarks
  • Any bugs and mistakes of course!
  • Documentation, in any form: examples, how tos, docstrings
+Home · RecurrentLayers.jl

RecurrentLayers

RecurrentLayers.jl extends the recurrent layers offered by Flux.jl with implementations of bleeding-edge recurrent layers that are not commonly available in base deep learning libraries. It is designed for seamless integration with the larger Flux ecosystem, enabling researchers and practitioners to leverage the latest developments in recurrent neural networks.
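For instance, the layers can be dropped into a standard Flux model. A minimal sketch of a sequence classifier follows; the sizes, the choice of MGU, and the last-step readout are made up for illustration.

```julia
using Flux, RecurrentLayers

model = Chain(
    MGU(4 => 16),               # 4 input features, 16 hidden units
    x -> x[:, end, :],          # keep the hidden state of the last time step
    Dense(16 => 2),
    softmax
)

x = rand(Float32, 4, 20, 32)    # features x time steps x batch
y = model(x)                    # 2 x 32
```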

Implemented layers

  • Minimal gated unit as MGUCell arxiv
  • Light gated recurrent unit as LiGRUCell arxiv
  • Independently recurrent neural networks as IndRNNCell arxiv
  • Recurrent additive networks as RANCell arxiv
  • Recurrent highway network as RHNCell arxiv
  • Light recurrent unit as LightRUCell pub
  • Neural architecture search unit as NASCell arxiv
  • Evolving recurrent neural networks as MUT1Cell, MUT2Cell, MUT3Cell pub
  • Structurally constrained recurrent neural network as SCRNCell arxiv
  • Peephole long short term memory as PeepholeLSTMCell pub
  • FastRNNCell and FastGRNNCell arxiv

Contributing

Contributions are always welcome! We specifically look for:

  • Recurrent cells you would like to see implemented
  • Benchmarks
  • Any bugs and mistakes of course!
  • Documentation, in any form: examples, how tos, docstrings
diff --git a/dev/roadmap/index.html b/dev/roadmap/index.html index ce5c103..63f688e 100644 --- a/dev/roadmap/index.html +++ b/dev/roadmap/index.html @@ -1,2 +1,2 @@ -Roadmap · RecurrentLayers.jl

Roadmap

This page documents some planned work for RecurrentLayers.jl. Future work for this library includes additional cells such as:

  • FastRNNs and FastGRUs (current focus) arxiv
  • Unitary recurrent neural networks arxiv
  • Modern recurrent neural networks such as LRU and minLSTM/minGRU
  • Quasi recurrent neural networks arxiv

Additionally, some cell-independent architectures are planned that expand the capabilities of recurrent architectures and could theoretically take any cell.

An implementation of these would ideally look like, for example, FastSlow(RNNCell, input_size => hidden_size). More details on this soon!

+Roadmap · RecurrentLayers.jl

Roadmap

This page documents some planned work for RecurrentLayers.jl. Future work for this library includes additional cells such as:

  • FastRNNs and FastGRUs (current focus) arxiv
  • Unitary recurrent neural networks arxiv
  • Modern recurrent neural networks such as LRU and minLSTM/minGRU
  • Quasi recurrent neural networks arxiv

Additionally, some cell-independent architectures are planned that expand the capabilities of recurrent architectures and could theoretically take any cell.

An implementation of these would ideally look like, for example, FastSlow(RNNCell, input_size => hidden_size). More details on this soon!

Default is true\n\nEquations\n\nbeginaligned\ns_ell^t = h_ell^t odot t_ell^t + s_ell-1^t odot c_ell^t \ntextwhere \nh_ell^t = tanh(W_h x^tmathbbI_ell = 1 + U_h_ell s_ell-1^t + b_h_ell) \nt_ell^t = sigma(W_t x^tmathbbI_ell = 1 + U_t_ell s_ell-1^t + b_t_ell) \nc_ell^t = sigma(W_c x^tmathbbI_ell = 1 + U_c_ell s_ell-1^t + b_c_ell)\nendaligned\n\nForward\n\nrnncell(inp, [state])\n\n\n\n\n\n","category":"type"},{"location":"api/cells/#RecurrentLayers.RHNCellUnit","page":"Cells","title":"RecurrentLayers.RHNCellUnit","text":"RHNCellUnit((input_size => hidden_size)::Pair;\n init_kernel = glorot_uniform,\n bias = true)\n\n\n\n\n\n","category":"type"},{"location":"api/cells/#RecurrentLayers.MUT1Cell","page":"Cells","title":"RecurrentLayers.MUT1Cell","text":"MUT1Cell((input_size => hidden_size);\n init_kernel = glorot_uniform,\n init_recurrent_kernel = glorot_uniform,\n bias = true)\n\nMutated unit 1 cell. See MUT1 for a layer that processes entire sequences.\n\nArguments\n\ninput_size => hidden_size: input and inner dimension of the layer\ninit_kernel: initializer for the input to hidden weights\ninit_recurrent_kernel: initializer for the hidden to hidden weights\nbias: include a bias or not. Default is true\n\nEquations\n\nbeginaligned\nz = sigma(W_z x_t + b_z) \nr = sigma(W_r x_t + U_r h_t + b_r) \nh_t+1 = tanh(U_h (r odot h_t) + tanh(W_h x_t) + b_h) odot z \nquad + h_t odot (1 - z)\nendaligned\n\nForward\n\nmutcell(inp, state)\nmutcell(inp)\n\nArguments\n\ninp: The input to the mutcell. It should be a vector of size input_size or a matrix of size input_size x batch_size.\nstate: The hidden state of the MUTCell. It should be a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros.\n\nReturns\n\nA tuple (output, state), where both elements are given by the updated state new_state, \n\na tensor of size hidden_size or hidden_size x batch_size.\n\n\n\n\n\n","category":"type"},{"location":"api/cells/#RecurrentLayers.MUT2Cell","page":"Cells","title":"RecurrentLayers.MUT2Cell","text":"MUT2Cell((input_size => hidden_size);\n init_kernel = glorot_uniform,\n init_recurrent_kernel = glorot_uniform,\n bias = true)\n\nMutated unit 2 cell. See MUT2 for a layer that processes entire sequences.\n\nArguments\n\ninput_size => hidden_size: input and inner dimension of the layer\ninit_kernel: initializer for the input to hidden weights\ninit_recurrent_kernel: initializer for the hidden to hidden weights\nbias: include a bias or not. Default is true\n\nEquations\n\nbeginaligned\nz = sigma(W_z x_t + U_z h_t + b_z) \nr = sigma(x_t + U_r h_t + b_r) \nh_t+1 = tanh(U_h (r odot h_t) + W_h x_t + b_h) odot z \nquad + h_t odot (1 - z)\nendaligned\n\nForward\n\nmutcell(inp, state)\nmutcell(inp)\n\nArguments\n\ninp: The input to the mutcell. It should be a vector of size input_size or a matrix of size input_size x batch_size.\nstate: The hidden state of the MUTCell. It should be a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros.\n\nReturns\n\nA tuple (output, state), where both elements are given by the updated state new_state, \n\na tensor of size hidden_size or hidden_size x batch_size.\n\n\n\n\n\n","category":"type"},{"location":"api/cells/#RecurrentLayers.MUT3Cell","page":"Cells","title":"RecurrentLayers.MUT3Cell","text":"MUT3Cell((input_size => hidden_size);\n init_kernel = glorot_uniform,\n init_recurrent_kernel = glorot_uniform,\n bias = true)\n\nMutated unit 3 cell. 
See MUT3 for a layer that processes entire sequences.\n\nArguments\n\ninput_size => hidden_size: input and inner dimension of the layer\ninit_kernel: initializer for the input to hidden weights\ninit_recurrent_kernel: initializer for the hidden to hidden weights\nbias: include a bias or not. Default is true\n\nEquations\n\nbeginaligned\nz = sigma(W_z x_t + U_z tanh(h_t) + b_z) \nr = sigma(W_r x_t + U_r h_t + b_r) \nh_t+1 = tanh(U_h (r odot h_t) + W_h x_t + b_h) odot z \nquad + h_t odot (1 - z)\nendaligned\n\nForward\n\nmutcell(inp, state)\nmutcell(inp)\n\nArguments\n\ninp: The input to the mutcell. It should be a vector of size input_size or a matrix of size input_size x batch_size.\nstate: The hidden state of the MUTCell. It should be a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros.\n\nReturns\n\nA tuple (output, state), where both elements are given by the updated state new_state, \n\na tensor of size hidden_size or hidden_size x batch_size.\n\n\n\n\n\n","category":"type"},{"location":"api/cells/#RecurrentLayers.SCRNCell","page":"Cells","title":"RecurrentLayers.SCRNCell","text":"SCRNCell((input_size => hidden_size)::Pair;\n init_kernel = glorot_uniform,\n init_recurrent_kernel = glorot_uniform,\n bias = true,\n alpha = 0.0)\n\nStructurally contraint recurrent unit. See SCRN for a layer that processes entire sequences.\n\nArguments\n\ninput_size => hidden_size: input and inner dimension of the layer\ninit_kernel: initializer for the input to hidden weights\ninit_recurrent_kernel: initializer for the hidden to hidden weights\nbias: include a bias or not. Default is true\nalpha: structural contraint. Default is 0.0\n\nEquations\n\nbeginaligned\ns_t = (1 - alpha) W_s x_t + alpha s_t-1 \nh_t = sigma(W_h s_t + U_h h_t-1 + b_h) \ny_t = f(U_y h_t + W_y s_t)\nendaligned\n\nForward\n\nscrncell(inp, (state, cstate))\nscrncell(inp)\n\nArguments\n\ninp: The input to the scrncell. It should be a vector of size input_size or a matrix of size input_size x batch_size.\n(state, cstate): A tuple containing the hidden and cell states of the SCRNCell. They should be vectors of size hidden_size or matrices of size hidden_size x batch_size. If not provided, they are assumed to be vectors of zeros.\n\nReturns\n\nA tuple (output, state), where output = new_state is the new hidden state and state = (new_state, new_cstate) is the new hidden and cell state. They are tensors of size hidden_size or hidden_size x batch_size.\n\n\n\n\n\n","category":"type"},{"location":"api/cells/#RecurrentLayers.PeepholeLSTMCell","page":"Cells","title":"RecurrentLayers.PeepholeLSTMCell","text":"PeepholeLSTMCell((input_size => hidden_size)::Pair;\n init_kernel = glorot_uniform,\n init_recurrent_kernel = glorot_uniform,\n bias = true)\n\nPeephole long short term memory cell. See PeepholeLSTM for a layer that processes entire sequences.\n\nArguments\n\ninput_size => hidden_size: input and inner dimension of the layer\ninit_kernel: initializer for the input to hidden weights\ninit_recurrent_kernel: initializer for the hidden to hidden weights\nbias: include a bias or not. 
Default is true\n\nEquations\n\nbeginaligned\nf_t = sigma_g(W_f x_t + U_f c_t-1 + b_f) \ni_t = sigma_g(W_i x_t + U_i c_t-1 + b_i) \no_t = sigma_g(W_o x_t + U_o c_t-1 + b_o) \nc_t = f_t odot c_t-1 + i_t odot sigma_c(W_c x_t + b_c) \nh_t = o_t odot sigma_h(c_t)\nendaligned\n\nForward\n\npeepholelstmcell(inp, (state, cstate))\npeepholelstmcell(inp)\n\nArguments\n\ninp: The input to the peepholelstmcell. It should be a vector of size input_size or a matrix of size input_size x batch_size.\n(state, cstate): A tuple containing the hidden and cell states of the PeepholeLSTMCell. They should be vectors of size hidden_size or matrices of size hidden_size x batch_size. If not provided, they are assumed to be vectors of zeros.\n\nReturns\n\nA tuple (output, state), where output = new_state is the new hidden state and state = (new_state, new_cstate) is the new hidden and cell state. They are tensors of size hidden_size or hidden_size x batch_size.\n\n\n\n\n\n","category":"type"},{"location":"api/cells/#RecurrentLayers.FastRNNCell","page":"Cells","title":"RecurrentLayers.FastRNNCell","text":"FastRNNCell((input_size => hidden_size), [activation];\n init_kernel = glorot_uniform,\n init_recurrent_kernel = glorot_uniform,\n bias = true)\n\nFast recurrent neural network cell. See FastRNN for a layer that processes entire sequences.\n\nArguments\n\ninput_size => hidden_size: input and inner dimension of the layer\nactivation: the activation function, defaults to tanh_fast\ninit_kernel: initializer for the input to hidden weights\ninit_recurrent_kernel: initializer for the hidden to hidden weights\nbias: include a bias or not. Default is true\n\nEquations\n\nbeginaligned\ntildeh_t = sigma(W_h x_t + U_h h_t-1 + b) \nh_t = alpha tildeh_t + beta h_t-1\nendaligned\n\nForward\n\nfastrnncell(inp, state)\nfastrnncell(inp)\n\nArguments\n\ninp: The input to the fastrnncell. It should be a vector of size input_size or a matrix of size input_size x batch_size.\nstate: The hidden state of the FastRNN. It should be a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros.\n\nReturns\n\nA tuple (output, state), where both elements are given by the updated state new_state, a tensor of size hidden_size or hidden_size x batch_size.\n\n\n\n\n\n","category":"type"},{"location":"api/cells/#RecurrentLayers.FastGRNNCell","page":"Cells","title":"RecurrentLayers.FastGRNNCell","text":"FastGRNNCell((input_size => hidden_size), [activation];\n init_kernel = glorot_uniform,\n init_recurrent_kernel = glorot_uniform,\n bias = true)\n\nFast gated recurrent neural network cell. See FastGRNN for a layer that processes entire sequences.\n\nArguments\n\ninput_size => hidden_size: input and inner dimension of the layer\nactivation: the activation function, defaults to tanh_fast\ninit_kernel: initializer for the input to hidden weights\ninit_recurrent_kernel: initializer for the hidden to hidden weights\nbias: include a bias or not. Default is true\n\nEquations\n\nbeginaligned\nz_t = sigma(W_z x_t + U_z h_t-1 + b_z) \ntildeh_t = tanh(W_h x_t + U_h h_t-1 + b_h) \nh_t = big((zeta (1 - z_t) + nu) odot tildeh_tbig) + z_t odot h_t-1\nendaligned\n\nForward\n\nfastgrnncell(inp, state)\nfastgrnncell(inp)\n\nArguments\n\ninp: The input to the fastgrnncell. It should be a vector of size input_size or a matrix of size input_size x batch_size.\nstate: The hidden state of the FastGRNN. It should be a vector of size hidden_size or a matrix of size hidden_size x batch_size. 
If not provided, it is assumed to be a vector of zeros.\n\nReturns\n\nA tuple (output, state), where both elements are given by the updated state new_state, a tensor of size hidden_size or hidden_size x batch_size.\n\n\n\n\n\n","category":"type"},{"location":"api/layers/#Cell-wrappers","page":"Layers","title":"Cell wrappers","text":"","category":"section"},{"location":"api/layers/","page":"Layers","title":"Layers","text":"RAN\nIndRNN\nLightRU\nLiGRU\nMGU\nNAS\nRHN\nMUT1\nMUT2\nMUT3\nSCRN\nPeepholeLSTM\nFastRNN\nFastGRNN","category":"page"},{"location":"api/layers/#RecurrentLayers.RAN","page":"Layers","title":"RecurrentLayers.RAN","text":"RAN(input_size => hidden_size; kwargs...)\n\nThe RANCell, introduced in this paper, is a recurrent cell layer which provides additional memory through the use of gates.\n\nand returns both ht anf ct.\n\nSee RANCell for a layer that processes a single sequence.\n\nArguments\n\ninput_size => hidden_size: input and inner dimension of the layer\ninit_kernel: initializer for the input to hidden weights\ninit_recurrent_kernel: initializer for the hidden to hidden weights\nbias: include a bias or not. Default is true\n\nEquations\n\nbeginaligned\ntildec_t = W_c x_t \ni_t = sigma(W_i x_t + U_i h_t-1 + b_i) \nf_t = sigma(W_f x_t + U_f h_t-1 + b_f) \nc_t = i_t odot tildec_t + f_t odot c_t-1 \nh_t = g(c_t)\nendaligned\n\nForward\n\nran(inp, (state, cstate))\nran(inp)\n\nArguments\n\ninp: The input to the ran. It should be a vector of size input_size x len or a matrix of size input_size x len x batch_size.\n(state, cstate): A tuple containing the hidden and cell states of the RAN. They should be vectors of size hidden_size or matrices of size hidden_size x batch_size. If not provided, they are assumed to be vectors of zeros\n\nReturns\n\nNew hidden states new_states as an array of size hidden_size x len x batch_size.\n\n\n\n\n\n","category":"type"},{"location":"api/layers/#RecurrentLayers.IndRNN","page":"Layers","title":"RecurrentLayers.IndRNN","text":"IndRNN((input_size, hidden_size)::Pair, σ = tanh, σ=relu;\n kwargs...)\n\nIndependently recurrent network. See IndRNNCell for a layer that processes a single sequence.\n\nArguments\n\ninput_size => hidden_size: input and inner dimension of the layer\nσ: activation function. Default is tanh\ninit_kernel: initializer for the input to hidden weights\ninit_recurrent_kernel: initializer for the hidden to hidden weights\nbias: include a bias or not. Default is true\n\nEquations\n\nmathbfh_t = sigma(mathbfW mathbfx_t + mathbfu odot mathbfh_t-1 + mathbfb)\n\nForward\n\nindrnn(inp, state)\nindrnn(inp)\n\nArguments\n\ninp: The input to the indrnn. It should be a vector of size input_size x len or a matrix of size input_size x len x batch_size.\nstate: The hidden state of the IndRNN. If given, it is a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros.\n\nReturns\n\nNew hidden states new_states as an array of size hidden_size x len x batch_size.\n\n\n\n\n\n","category":"type"},{"location":"api/layers/#RecurrentLayers.LightRU","page":"Layers","title":"RecurrentLayers.LightRU","text":"LightRU((input_size => hidden_size)::Pair; kwargs...)\n\nLight recurrent unit network. 
See LightRUCell for a layer that processes a single sequence.\n\nArguments\n\ninput_size => hidden_size: input and inner dimension of the layer\ninit_kernel: initializer for the input to hidden weights\ninit_recurrent_kernel: initializer for the hidden to hidden weights\nbias: include a bias or not. Default is true\n\nEquations\n\nbeginaligned\ntildeh_t = tanh(W_h x_t) \nf_t = delta(W_f x_t + U_f h_t-1 + b_f) \nh_t = (1 - f_t) odot h_t-1 + f_t odot tildeh_t\nendaligned\n\nForward\n\nlightru(inp, state)\nlightru(inp)\n\nArguments\n\ninp: The input to the lightru. It should be a vector of size input_size x len or a matrix of size input_size x len x batch_size.\nstate: The hidden state of the LightRU. If given, it is a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros.\n\nReturns\n\nNew hidden states new_states as an array of size hidden_size x len x batch_size.\n\n\n\n\n\n","category":"type"},{"location":"api/layers/#RecurrentLayers.LiGRU","page":"Layers","title":"RecurrentLayers.LiGRU","text":"LiGRU((input_size => hidden_size)::Pair; kwargs...)\n\nLight gated recurrent network. The implementation does not include the batch normalization as described in the original paper. See LiGRUCell for a layer that processes a single sequence.\n\nArguments\n\ninput_size => hidden_size: input and inner dimension of the layer\ninit_kernel: initializer for the input to hidden weights\ninit_recurrent_kernel: initializer for the hidden to hidden weights\nbias: include a bias or not. Default is true\n\nEquations\n\nbeginaligned\nz_t = sigma(W_z x_t + U_z h_t-1) \ntildeh_t = textReLU(W_h x_t + U_h h_t-1) \nh_t = z_t odot h_t-1 + (1 - z_t) odot tildeh_t\nendaligned\n\nForward\n\nligru(inp, state)\nligru(inp)\n\nArguments\n\ninp: The input to the ligru. It should be a vector of size input_size x len or a matrix of size input_size x len x batch_size.\nstate: The hidden state of the LiGRU. If given, it is a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros.\n\nReturns\n\nNew hidden states new_states as an array of size hidden_size x len x batch_size.\n\n\n\n\n\n","category":"type"},{"location":"api/layers/#RecurrentLayers.MGU","page":"Layers","title":"RecurrentLayers.MGU","text":"MGU((input_size => hidden_size)::Pair; kwargs...)\n\nMinimal gated unit network. See MGUCell for a layer that processes a single sequence.\n\nArguments\n\ninput_size => hidden_size: input and inner dimension of the layer\ninit_kernel: initializer for the input to hidden weights\ninit_recurrent_kernel: initializer for the hidden to hidden weights\nbias: include a bias or not. Default is true\n\nEquations\n\nbeginaligned\nf_t = sigma(W_f x_t + U_f h_t-1 + b_f) \ntildeh_t = tanh(W_h x_t + U_h (f_t odot h_t-1) + b_h) \nh_t = (1 - f_t) odot h_t-1 + f_t odot tildeh_t\nendaligned\n\nForward\n\nmgu(inp, state)\nmgu(inp)\n\nArguments\n\ninp: The input to the mgu. It should be a vector of size input_size x len or a matrix of size input_size x len x batch_size.\nstate: The hidden state of the MGU. If given, it is a vector of size hidden_size or a matrix of size hidden_size x batch_size. 
If not provided, it is assumed to be a vector of zeros.\n\nReturns\n\nNew hidden states new_states as an array of size hidden_size x len x batch_size.\n\n\n\n\n\n","category":"type"},{"location":"api/layers/#RecurrentLayers.NAS","page":"Layers","title":"RecurrentLayers.NAS","text":"NAS((input_size => hidden_size)::Pair; kwargs...)\n\nNeural Architecture Search unit. See NASCell for a layer that processes a single sequence.\n\nArguments\n\ninput_size => hidden_size: input and inner dimension of the layer\ninit_kernel: initializer for the input to hidden weights\ninit_recurrent_kernel: initializer for the hidden to hidden weights\nbias: include a bias or not. Default is true\n\nEquations\n\nbeginaligned\ntextFirst Layer Outputs \no_1 = sigma(W_i^(1) x_t + W_h^(1) h_t-1 + b^(1)) \no_2 = textReLU(W_i^(2) x_t + W_h^(2) h_t-1 + b^(2)) \no_3 = sigma(W_i^(3) x_t + W_h^(3) h_t-1 + b^(3)) \no_4 = textReLU(W_i^(4) x_t cdot W_h^(4) h_t-1) \no_5 = tanh(W_i^(5) x_t + W_h^(5) h_t-1 + b^(5)) \no_6 = sigma(W_i^(6) x_t + W_h^(6) h_t-1 + b^(6)) \no_7 = tanh(W_i^(7) x_t + W_h^(7) h_t-1 + b^(7)) \no_8 = sigma(W_i^(8) x_t + W_h^(8) h_t-1 + b^(8)) \n\ntextSecond Layer Computations \nl_1 = tanh(o_1 cdot o_2) \nl_2 = tanh(o_3 + o_4) \nl_3 = tanh(o_5 cdot o_6) \nl_4 = sigma(o_7 + o_8) \n\ntextInject Cell State \nl_1 = tanh(l_1 + c_textstate) \n\ntextFinal Layer Computations \nc_textnew = l_1 cdot l_2 \nl_5 = tanh(l_3 + l_4) \nh_textnew = tanh(c_textnew cdot l_5)\nendaligned\n\nForward\n\nnas(inp, (state, cstate))\nnas(inp)\n\nArguments\n\ninp: The input to the nas. It should be a vector of size input_size x len or a matrix of size input_size x len x batch_size.\n(state, cstate): A tuple containing the hidden and cell states of the NAS. They should be vectors of size hidden_size or matrices of size hidden_size x batch_size. If not provided, they are assumed to be vectors of zeros\n\nReturns\n\nNew hidden states new_states as an array of size hidden_size x len x batch_size.\n\n\n\n\n\n","category":"type"},{"location":"api/layers/#RecurrentLayers.RHN","page":"Layers","title":"RecurrentLayers.RHN","text":"RHN((input_size => hidden_size)::Pair depth=3; kwargs...)\n\nRecurrent highway network. See RHNCellUnit for a the unit component of this layer. See RHNCell for a layer that processes a single sequence.\n\nArguments\n\ninput_size => hidden_size: input and inner dimension of the layer\ndepth: depth of the recurrence. Default is 3\ncouple_carry: couples the carry gate and the transform gate. Default true\ninit_kernel: initializer for the input to hidden weights\nbias: include a bias or not. Default is true\n\nEquations\n\nbeginaligned\ns_ell^t = h_ell^t odot t_ell^t + s_ell-1^t odot c_ell^t \ntextwhere \nh_ell^t = tanh(W_h x^tmathbbI_ell = 1 + U_h_ell s_ell-1^t + b_h_ell) \nt_ell^t = sigma(W_t x^tmathbbI_ell = 1 + U_t_ell s_ell-1^t + b_t_ell) \nc_ell^t = sigma(W_c x^tmathbbI_ell = 1 + U_c_ell s_ell-1^t + b_c_ell)\nendaligned\n\n\n\n\n\n","category":"type"},{"location":"api/layers/#RecurrentLayers.MUT1","page":"Layers","title":"RecurrentLayers.MUT1","text":"MUT1((input_size => hidden_size); kwargs...)\n\nMutated unit 1 network. See MUT1Cell for a layer that processes a single sequence.\n\nArguments\n\ninput_size => hidden_size: input and inner dimension of the layer\ninit_kernel: initializer for the input to hidden weights\ninit_recurrent_kernel: initializer for the hidden to hidden weights\nbias: include a bias or not. 
Default is true\n\nEquations\n\nbeginaligned\nz = sigma(W_z x_t + b_z) \nr = sigma(W_r x_t + U_r h_t + b_r) \nh_t+1 = tanh(U_h (r odot h_t) + tanh(W_h x_t) + b_h) odot z \nquad + h_t odot (1 - z)\nendaligned\n\nForward\n\nmut(inp, state)\nmut(inp)\n\nArguments\n\ninp: The input to the mut. It should be a vector of size input_size x len or a matrix of size input_size x len x batch_size.\nstate: The hidden state of the MUT. If given, it is a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros.\n\nReturns\n\nNew hidden states new_states as an array of size hidden_size x len x batch_size.\n\n\n\n\n\n","category":"type"},{"location":"api/layers/#RecurrentLayers.MUT2","page":"Layers","title":"RecurrentLayers.MUT2","text":"MUT2Cell((input_size => hidden_size); kwargs...)\n\nMutated unit 2 network. See MUT2Cell for a layer that processes a single sequence.\n\nArguments\n\ninput_size => hidden_size: input and inner dimension of the layer\ninit_kernel: initializer for the input to hidden weights\ninit_recurrent_kernel: initializer for the hidden to hidden weights\nbias: include a bias or not. Default is true\n\nEquations\n\nbeginaligned\nz = sigma(W_z x_t + U_z h_t + b_z) \nr = sigma(x_t + U_r h_t + b_r) \nh_t+1 = tanh(U_h (r odot h_t) + W_h x_t + b_h) odot z \nquad + h_t odot (1 - z)\nendaligned\n\nForward\n\nmut(inp, state)\nmut(inp)\n\nArguments\n\ninp: The input to the mut. It should be a vector of size input_size x len or a matrix of size input_size x len x batch_size.\nstate: The hidden state of the MUT. If given, it is a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros.\n\nReturns\n\nNew hidden states new_states as an array of size hidden_size x len x batch_size.\n\n\n\n\n\n","category":"type"},{"location":"api/layers/#RecurrentLayers.MUT3","page":"Layers","title":"RecurrentLayers.MUT3","text":"MUT3((input_size => hidden_size); kwargs...)\n\nMutated unit 3 network. See MUT3Cell for a layer that processes a single sequence.\n\nArguments\n\ninput_size => hidden_size: input and inner dimension of the layer\ninit_kernel: initializer for the input to hidden weights\ninit_recurrent_kernel: initializer for the hidden to hidden weights\nbias: include a bias or not. Default is true\n\nEquations\n\nbeginaligned\nz = sigma(W_z x_t + U_z tanh(h_t) + b_z) \nr = sigma(W_r x_t + U_r h_t + b_r) \nh_t+1 = tanh(U_h (r odot h_t) + W_h x_t + b_h) odot z \nquad + h_t odot (1 - z)\nendaligned\n\nForward\n\nmut(inp, state)\nmut(inp)\n\nArguments\n\ninp: The input to the mut. It should be a vector of size input_size x len or a matrix of size input_size x len x batch_size.\nstate: The hidden state of the MUT. If given, it is a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros.\n\nReturns\n\nNew hidden states new_states as an array of size hidden_size x len x batch_size.\n\n\n\n\n\n","category":"type"},{"location":"api/layers/#RecurrentLayers.SCRN","page":"Layers","title":"RecurrentLayers.SCRN","text":"SCRN((input_size => hidden_size)::Pair;\n init_kernel = glorot_uniform,\n init_recurrent_kernel = glorot_uniform,\n bias = true,\n alpha = 0.0)\n\nStructurally contraint recurrent unit. 
See SCRNCell for a layer that processes a single sequence.\n\nArguments\n\ninput_size => hidden_size: input and inner dimension of the layer\ninit_kernel: initializer for the input to hidden weights\ninit_recurrent_kernel: initializer for the hidden to hidden weights\nbias: include a bias or not. Default is true\nalpha: structural contraint. Default is 0.0\n\nEquations\n\nbeginaligned\ns_t = (1 - alpha) W_s x_t + alpha s_t-1 \nh_t = sigma(W_h s_t + U_h h_t-1 + b_h) \ny_t = f(U_y h_t + W_y s_t)\nendaligned\n\nForward\n\nscrn(inp, (state, cstate))\nscrn(inp)\n\nArguments\n\ninp: The input to the scrn. It should be a vector of size input_size x len or a matrix of size input_size x len x batch_size.\n(state, cstate): A tuple containing the hidden and cell states of the SCRN. They should be vectors of size hidden_size or matrices of size hidden_size x batch_size. If not provided, they are assumed to be vectors of zeros\n\nReturns\n\nNew hidden states new_states as an array of size hidden_size x len x batch_size.\n\n\n\n\n\n","category":"type"},{"location":"api/layers/#RecurrentLayers.PeepholeLSTM","page":"Layers","title":"RecurrentLayers.PeepholeLSTM","text":"PeepholeLSTM((input_size => hidden_size)::Pair; kwargs...)\n\nPeephole long short term memory network. See PeepholeLSTMCell for a layer that processes a single sequence.\n\nArguments\n\ninput_size => hidden_size: input and inner dimension of the layer\ninit_kernel: initializer for the input to hidden weights\ninit_recurrent_kernel: initializer for the hidden to hidden weights\nbias: include a bias or not. Default is true\n\nEquations\n\nbeginalign\nf_t = sigma_g(W_f x_t + U_f c_t-1 + b_f) \ni_t = sigma_g(W_i x_t + U_i c_t-1 + b_i) \no_t = sigma_g(W_o x_t + U_o c_t-1 + b_o) \nc_t = f_t odot c_t-1 + i_t odot sigma_c(W_c x_t + b_c) \nh_t = o_t odot sigma_h(c_t)\nendalign\n\nForward\n\npeepholelstm(inp, (state, cstate))\npeepholelstm(inp)\n\nArguments\n\ninp: The input to the peepholelstm. It should be a vector of size input_size x len or a matrix of size input_size x len x batch_size.\n(state, cstate): A tuple containing the hidden and cell states of the PeepholeLSTM. They should be vectors of size hidden_size or matrices of size hidden_size x batch_size. If not provided, they are assumed to be vectors of zeros\n\nReturns\n\nNew hidden states new_states as an array of size hidden_size x len x batch_size.\n\n\n\n\n\n","category":"type"},{"location":"api/layers/#RecurrentLayers.FastRNN","page":"Layers","title":"RecurrentLayers.FastRNN","text":"FastRNN((input_size => hidden_size), [activation]; kwargs...)\n\nFast recurrent neural network. See FastRNNCell for a layer that processes a single sequences.\n\nArguments\n\ninput_size => hidden_size: input and inner dimension of the layer\nactivation: the activation function, defaults to tanh_fast\ninit_kernel: initializer for the input to hidden weights\ninit_recurrent_kernel: initializer for the hidden to hidden weights\nbias: include a bias or not. Default is true\n\nEquations\n\nbeginaligned\ntildeh_t = sigma(W_h x_t + U_h h_t-1 + b) \nh_t = alpha tildeh_t + beta h_t-1\nendaligned\n\nForward\n\nfastrnn(inp, state)\nfastrnn(inp)\n\nArguments\n\ninp: The input to the fastrnn. It should be a vector of size input_size x len or a matrix of size input_size x len x batch_size.\nstate: The hidden state of the FastRNN. If given, it is a vector of size hidden_size or a matrix of size hidden_size x batch_size. 
If not provided, it is assumed to be a vector of zeros.\n\nReturns\n\nNew hidden states new_states as an array of size hidden_size x len x batch_size.\n\n\n\n\n\n","category":"type"},{"location":"api/layers/#RecurrentLayers.FastGRNN","page":"Layers","title":"RecurrentLayers.FastGRNN","text":"FastGRNN((input_size => hidden_size), [activation]; kwargs...)\n\nFast recurrent neural network. See FastGRNNCell for a layer that processes a single sequences.\n\nArguments\n\ninput_size => hidden_size: input and inner dimension of the layer\nactivation: the activation function, defaults to tanh_fast\ninit_kernel: initializer for the input to hidden weights\ninit_recurrent_kernel: initializer for the hidden to hidden weights\nbias: include a bias or not. Default is true\n\nEquations\n\nbeginaligned\nz_t = sigma(W_z x_t + U_z h_t-1 + b_z) \ntildeh_t = tanh(W_h x_t + U_h h_t-1 + b_h) \nh_t = big((zeta (1 - z_t) + nu) odot tildeh_tbig) + z_t odot h_t-1\nendaligned\n\nForward\n\nfastgrnn(inp, state)\nfastgrnn(inp)\n\nArguments\n\ninp: The input to the fastgrnn. It should be a vector of size input_size or a matrix of size input_size x batch_size.\nstate: The hidden state of the FastGRNN. It should be a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros.\n\nReturns\n\nNew hidden states new_states as an array of size hidden_size x len x batch_size.\n\n\n\n\n\n","category":"type"},{"location":"roadmap/#Roadmap","page":"Roadmap","title":"Roadmap","text":"","category":"section"},{"location":"roadmap/","page":"Roadmap","title":"Roadmap","text":"This page documents some planned work for RecurrentLayers.jl. Future work for this library includes additional cells such as:","category":"page"},{"location":"roadmap/","page":"Roadmap","title":"Roadmap","text":"FastRNNs and FastGRUs (current focus) arxiv\nUnitary recurrent neural networks arxiv\nModern recurrent neural networks such as LRU and minLSTM/minGRU\nQuasi recurrent neural networks arxiv","category":"page"},{"location":"roadmap/","page":"Roadmap","title":"Roadmap","text":"Additionally, some cell-independent architectures are also planned, that expand the ability of recurrent architectures and could theoretically take any cell:","category":"page"},{"location":"roadmap/","page":"Roadmap","title":"Roadmap","text":"Clockwork rnns arxiv\nPhased rnns arxiv\nSegment rnn arxiv\nFast-Slow rnns arxiv","category":"page"},{"location":"roadmap/","page":"Roadmap","title":"Roadmap","text":"An implementation of these ideally would be, for example FastSlow(RNNCell, input_size => hidden_size). More details on this soon!","category":"page"},{"location":"","page":"Home","title":"Home","text":"CurrentModule = RecurrentLayers","category":"page"},{"location":"#RecurrentLayers","page":"Home","title":"RecurrentLayers","text":"","category":"section"},{"location":"","page":"Home","title":"Home","text":"RecurrentLayers.jl extends Flux.jl recurrent layers offering by providing implementations of bleeding edge recurrent layers not commonly available in base deep learning libraries. 
It is designed for a seamless integration with the larger Flux ecosystem, enabling researchers and practitioners to leverage the latest developments in recurrent neural networks.","category":"page"},{"location":"#Implemented-layers","page":"Home","title":"Implemented layers","text":"","category":"section"},{"location":"","page":"Home","title":"Home","text":"Minimal gated unit as MGUCell arxiv\nLight gated recurrent unit as LiGRUCell arxiv\nIndependently recurrent neural networks as IndRNNCell arxiv\nRecurrent addictive networks as RANCell arxiv\nRecurrent highway network as RHNCell arixv\nLight recurrent unit as LightRUCell pub\nNeural architecture search unit NASCell arxiv\nEvolving recurrent neural networks as MUT1Cell, MUT2Cell, MUT3Cell pub\nStructurally constrained recurrent neural network as SCRNCell arxiv\nPeephole long short term memory as PeepholeLSTMCell pub\nFastRNNCell and FastGRNNCell arxiv","category":"page"},{"location":"#Contributing","page":"Home","title":"Contributing","text":"","category":"section"},{"location":"","page":"Home","title":"Home","text":"Contributions are always welcome! We specifically look for :","category":"page"},{"location":"","page":"Home","title":"Home","text":"Recurrent cells you would like to see implemented \nBenchmarks\nAny bugs and mistakes of course!\nDocumentation, in any form: examples, how tos, docstrings ","category":"page"}] +[{"location":"api/cells/#Cells","page":"Cells","title":"Cells","text":"","category":"section"},{"location":"api/cells/","page":"Cells","title":"Cells","text":"RANCell\nIndRNNCell\nLightRUCell\nLiGRUCell\nMGUCell\nNASCell\nRHNCell\nRHNCellUnit\nMUT1Cell\nMUT2Cell\nMUT3Cell\nSCRNCell\nPeepholeLSTMCell\nFastRNNCell\nFastGRNNCell","category":"page"},{"location":"api/cells/#RecurrentLayers.RANCell","page":"Cells","title":"RecurrentLayers.RANCell","text":"RANCell((input_size => hidden_size)::Pair;\n init_kernel = glorot_uniform,\n init_recurrent_kernel = glorot_uniform,\n bias = true)\n\nThe RANCell, introduced in this paper, is a recurrent cell layer which provides additional memory through the use of gates.\n\nSee RAN for a layer that processes entire sequences.\n\nArguments\n\ninput_size => hidden_size: input and inner dimension of the layer\ninit_kernel: initializer for the input to hidden weights\ninit_recurrent_kernel: initializer for the hidden to hidden weights\nbias: include a bias or not. Default is true\n\nEquations\n\nbeginaligned\ntildec_t = W_c x_t \ni_t = sigma(W_i x_t + U_i h_t-1 + b_i) \nf_t = sigma(W_f x_t + U_f h_t-1 + b_f) \nc_t = i_t odot tildec_t + f_t odot c_t-1 \nh_t = g(c_t)\nendaligned\n\nForward\n\nrancell(inp, (state, cstate))\nrancell(inp)\n\nArguments\n\ninp: The input to the rancell. It should be a vector of size input_size or a matrix of size input_size x batch_size.\n(state, cstate): A tuple containing the hidden and cell states of the RANCell. They should be vectors of size hidden_size or matrices of size hidden_size x batch_size. If not provided, they are assumed to be vectors of zeros, initialized by Flux.initialstates.\n\nReturns\n\nA tuple (output, state), where output = new_state is the new hidden state and state = (new_state, new_cstate) is the new hidden and cell state. 
They are tensors of size hidden_size or hidden_size x batch_size.\n\n\n\n\n\n","category":"type"},{"location":"api/cells/#RecurrentLayers.IndRNNCell","page":"Cells","title":"RecurrentLayers.IndRNNCell","text":"IndRNNCell((input_size => hidden_size)::Pair, σ=relu;\n init_kernel = glorot_uniform,\n init_recurrent_kernel = glorot_uniform,\n bias = true)\n\nIndependently recurrent cell. See IndRNN for a layer that processes entire sequences.\n\nArguments\n\ninput_size => hidden_size: input and inner dimension of the layer\nσ: activation function. Default is tanh\ninit_kernel: initializer for the input to hidden weights\ninit_recurrent_kernel: initializer for the hidden to hidden weights\nbias: include a bias or not. Default is true\n\nEquations\n\nmathbfh_t = sigma(mathbfW mathbfx_t + mathbfu odot mathbfh_t-1 + mathbfb)\n\nForward\n\nindrnncell(inp, state)\nindrnncell(inp)\n\nArguments\n\ninp: The input to the indrnncell. It should be a vector of size input_size or a matrix of size input_size x batch_size.\nstate: The hidden state of the IndRNNCell. It should be a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros, initialized by Flux.initialstates.\n\nReturns\n\nA tuple (output, state), where both elements are given by the updated state new_state, a tensor of size hidden_size or hidden_size x batch_size.\n\n\n\n\n\n","category":"type"},{"location":"api/cells/#RecurrentLayers.LightRUCell","page":"Cells","title":"RecurrentLayers.LightRUCell","text":"LightRUCell((input_size => hidden_size)::Pair;\n init_kernel = glorot_uniform,\n init_recurrent_kernel = glorot_uniform,\n bias = true)\n\nLight recurrent unit. See LightRU for a layer that processes entire sequences.\n\nArguments\n\ninput_size => hidden_size: input and inner dimension of the layer\ninit_kernel: initializer for the input to hidden weights\ninit_recurrent_kernel: initializer for the hidden to hidden weights\nbias: include a bias or not. Default is true\n\nEquations\n\nbeginaligned\ntildeh_t = tanh(W_h x_t) \nf_t = delta(W_f x_t + U_f h_t-1 + b_f) \nh_t = (1 - f_t) odot h_t-1 + f_t odot tildeh_t\nendaligned\n\nForward\n\nlightrucell(inp, state)\nlightrucell(inp)\n\nArguments\n\ninp: The input to the lightrucell. It should be a vector of size input_size or a matrix of size input_size x batch_size.\nstate: The hidden state of the LightRUCell. It should be a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros, initialized by Flux.initialstates.\n\nReturns\n\nA tuple (output, state), where both elements are given by the updated state new_state, a tensor of size hidden_size or hidden_size x batch_size.\n\n\n\n\n\n","category":"type"},{"location":"api/cells/#RecurrentLayers.LiGRUCell","page":"Cells","title":"RecurrentLayers.LiGRUCell","text":"LiGRUCell((input_size => hidden_size)::Pair;\n init_kernel = glorot_uniform,\n init_recurrent_kernel = glorot_uniform,\n bias = true)\n\nLight gated recurrent unit. The implementation does not include the batch normalization as described in the original paper. See LiGRU for a layer that processes entire sequences.\n\nArguments\n\ninput_size => hidden_size: input and inner dimension of the layer\ninit_kernel: initializer for the input to hidden weights\ninit_recurrent_kernel: initializer for the hidden to hidden weights\nbias: include a bias or not. 
Default is true\n\nEquations\n\nbeginaligned\nz_t = sigma(W_z x_t + U_z h_t-1) \ntildeh_t = textReLU(W_h x_t + U_h h_t-1) \nh_t = z_t odot h_t-1 + (1 - z_t) odot tildeh_t\nendaligned\n\nForward\n\nligrucell(inp, state)\nligrucell(inp)\n\nArguments\n\ninp: The input to the ligrucell. It should be a vector of size input_size or a matrix of size input_size x batch_size.\nstate: The hidden state of the LiGRUCell. It should be a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros, initialized by Flux.initialstates.\n\nReturns\n\nA tuple (output, state), where both elements are given by the updated state new_state, a tensor of size hidden_size or hidden_size x batch_size.\n\n\n\n\n\n","category":"type"},{"location":"api/cells/#RecurrentLayers.MGUCell","page":"Cells","title":"RecurrentLayers.MGUCell","text":"MGUCell((input_size => hidden_size)::Pair;\n init_kernel = glorot_uniform,\n init_recurrent_kernel = glorot_uniform,\n bias = true)\n\nMinimal gated unit. See MGU for a layer that processes entire sequences.\n\nArguments\n\ninput_size => hidden_size: input and inner dimension of the layer\ninit_kernel: initializer for the input to hidden weights\ninit_recurrent_kernel: initializer for the hidden to hidden weights\nbias: include a bias or not. Default is true\n\nEquations\n\nbeginaligned\nf_t = sigma(W_f x_t + U_f h_t-1 + b_f) \ntildeh_t = tanh(W_h x_t + U_h (f_t odot h_t-1) + b_h) \nh_t = (1 - f_t) odot h_t-1 + f_t odot tildeh_t\nendaligned\n\nForward\n\nmgucell(inp, state)\nmgucell(inp)\n\nArguments\n\ninp: The input to the mgucell. It should be a vector of size input_size or a matrix of size input_size x batch_size.\nstate: The hidden state of the MGUCell. It should be a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros, initialized by Flux.initialstates.\n\nReturns\n\nA tuple (output, state), where both elements are given by the updated state new_state, a tensor of size hidden_size or hidden_size x batch_size.\n\n\n\n\n\n","category":"type"},{"location":"api/cells/#RecurrentLayers.NASCell","page":"Cells","title":"RecurrentLayers.NASCell","text":"NASCell((input_size => hidden_size);\n init_kernel = glorot_uniform,\n init_recurrent_kernel = glorot_uniform,\n bias = true)\n\nNeural Architecture Search unit. See NAS for a layer that processes entire sequences.\n\nArguments\n\ninput_size => hidden_size: input and inner dimension of the layer\ninit_kernel: initializer for the input to hidden weights\ninit_recurrent_kernel: initializer for the hidden to hidden weights\nbias: include a bias or not. 
Default is true\n\nEquations\n\nbeginaligned\ntextFirst Layer Outputs \no_1 = sigma(W_i^(1) x_t + W_h^(1) h_t-1 + b^(1)) \no_2 = textReLU(W_i^(2) x_t + W_h^(2) h_t-1 + b^(2)) \no_3 = sigma(W_i^(3) x_t + W_h^(3) h_t-1 + b^(3)) \no_4 = textReLU(W_i^(4) x_t cdot W_h^(4) h_t-1) \no_5 = tanh(W_i^(5) x_t + W_h^(5) h_t-1 + b^(5)) \no_6 = sigma(W_i^(6) x_t + W_h^(6) h_t-1 + b^(6)) \no_7 = tanh(W_i^(7) x_t + W_h^(7) h_t-1 + b^(7)) \no_8 = sigma(W_i^(8) x_t + W_h^(8) h_t-1 + b^(8)) \n\ntextSecond Layer Computations \nl_1 = tanh(o_1 cdot o_2) \nl_2 = tanh(o_3 + o_4) \nl_3 = tanh(o_5 cdot o_6) \nl_4 = sigma(o_7 + o_8) \n\ntextInject Cell State \nl_1 = tanh(l_1 + c_textstate) \n\ntextFinal Layer Computations \nc_textnew = l_1 cdot l_2 \nl_5 = tanh(l_3 + l_4) \nh_textnew = tanh(c_textnew cdot l_5)\nendaligned\n\nForward\n\nnascell(inp, (state, cstate))\nnascell(inp)\n\nArguments\n\ninp: The input to the fastrnncell. It should be a vector of size input_size or a matrix of size input_size x batch_size.\n(state, cstate): A tuple containing the hidden and cell states of the NASCell. They should be vectors of size hidden_size or matrices of size hidden_size x batch_size. If not provided, they are assumed to be vectors of zeros, initialized by Flux.initialstates.\n\nReturns\n\nA tuple (output, state), where output = new_state is the new hidden state and state = (new_state, new_cstate) is the new hidden and cell state. They are tensors of size hidden_size or hidden_size x batch_size.\n\n\n\n\n\n","category":"type"},{"location":"api/cells/#RecurrentLayers.RHNCell","page":"Cells","title":"RecurrentLayers.RHNCell","text":"RHNCell((input_size => hidden_size), depth=3;\n couple_carry::Bool = true,\n cell_kwargs...)\n\nRecurrent highway network. See RHNCellUnit for a the unit component of this layer. See RHN for a layer that processes entire sequences.\n\nArguments\n\ninput_size => hidden_size: input and inner dimension of the layer\ndepth: depth of the recurrence. Default is 3\ncouple_carry: couples the carry gate and the transform gate. Default true\ninit_kernel: initializer for the input to hidden weights\nbias: include a bias or not. Default is true\n\nEquations\n\nbeginaligned\ns_ell^t = h_ell^t odot t_ell^t + s_ell-1^t odot c_ell^t \ntextwhere \nh_ell^t = tanh(W_h x^tmathbbI_ell = 1 + U_h_ell s_ell-1^t + b_h_ell) \nt_ell^t = sigma(W_t x^tmathbbI_ell = 1 + U_t_ell s_ell-1^t + b_t_ell) \nc_ell^t = sigma(W_c x^tmathbbI_ell = 1 + U_c_ell s_ell-1^t + b_c_ell)\nendaligned\n\nForward\n\nrnncell(inp, [state])\n\n\n\n\n\n","category":"type"},{"location":"api/cells/#RecurrentLayers.RHNCellUnit","page":"Cells","title":"RecurrentLayers.RHNCellUnit","text":"RHNCellUnit((input_size => hidden_size)::Pair;\n init_kernel = glorot_uniform,\n bias = true)\n\n\n\n\n\n","category":"type"},{"location":"api/cells/#RecurrentLayers.MUT1Cell","page":"Cells","title":"RecurrentLayers.MUT1Cell","text":"MUT1Cell((input_size => hidden_size);\n init_kernel = glorot_uniform,\n init_recurrent_kernel = glorot_uniform,\n bias = true)\n\nMutated unit 1 cell. See MUT1 for a layer that processes entire sequences.\n\nArguments\n\ninput_size => hidden_size: input and inner dimension of the layer\ninit_kernel: initializer for the input to hidden weights\ninit_recurrent_kernel: initializer for the hidden to hidden weights\nbias: include a bias or not. 
Default is true\n\nEquations\n\nbeginaligned\nz = sigma(W_z x_t + b_z) \nr = sigma(W_r x_t + U_r h_t + b_r) \nh_t+1 = tanh(U_h (r odot h_t) + tanh(W_h x_t) + b_h) odot z \nquad + h_t odot (1 - z)\nendaligned\n\nForward\n\nmutcell(inp, state)\nmutcell(inp)\n\nArguments\n\ninp: The input to the mutcell. It should be a vector of size input_size or a matrix of size input_size x batch_size.\nstate: The hidden state of the MUTCell. It should be a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros, initialized by Flux.initialstates.\n\nReturns\n\nA tuple (output, state), where both elements are given by the updated state new_state, \n\na tensor of size hidden_size or hidden_size x batch_size.\n\n\n\n\n\n","category":"type"},{"location":"api/cells/#RecurrentLayers.MUT2Cell","page":"Cells","title":"RecurrentLayers.MUT2Cell","text":"MUT2Cell((input_size => hidden_size);\n init_kernel = glorot_uniform,\n init_recurrent_kernel = glorot_uniform,\n bias = true)\n\nMutated unit 2 cell. See MUT2 for a layer that processes entire sequences.\n\nArguments\n\ninput_size => hidden_size: input and inner dimension of the layer\ninit_kernel: initializer for the input to hidden weights\ninit_recurrent_kernel: initializer for the hidden to hidden weights\nbias: include a bias or not. Default is true\n\nEquations\n\nbeginaligned\nz = sigma(W_z x_t + U_z h_t + b_z) \nr = sigma(x_t + U_r h_t + b_r) \nh_t+1 = tanh(U_h (r odot h_t) + W_h x_t + b_h) odot z \nquad + h_t odot (1 - z)\nendaligned\n\nForward\n\nmutcell(inp, state)\nmutcell(inp)\n\nArguments\n\ninp: The input to the mutcell. It should be a vector of size input_size or a matrix of size input_size x batch_size.\nstate: The hidden state of the MUTCell. It should be a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros, initialized by Flux.initialstates.\n\nReturns\n\nA tuple (output, state), where both elements are given by the updated state new_state, \n\na tensor of size hidden_size or hidden_size x batch_size.\n\n\n\n\n\n","category":"type"},{"location":"api/cells/#RecurrentLayers.MUT3Cell","page":"Cells","title":"RecurrentLayers.MUT3Cell","text":"MUT3Cell((input_size => hidden_size);\n init_kernel = glorot_uniform,\n init_recurrent_kernel = glorot_uniform,\n bias = true)\n\nMutated unit 3 cell. See MUT3 for a layer that processes entire sequences.\n\nArguments\n\ninput_size => hidden_size: input and inner dimension of the layer\ninit_kernel: initializer for the input to hidden weights\ninit_recurrent_kernel: initializer for the hidden to hidden weights\nbias: include a bias or not. Default is true\n\nEquations\n\nbeginaligned\nz = sigma(W_z x_t + U_z tanh(h_t) + b_z) \nr = sigma(W_r x_t + U_r h_t + b_r) \nh_t+1 = tanh(U_h (r odot h_t) + W_h x_t + b_h) odot z \nquad + h_t odot (1 - z)\nendaligned\n\nForward\n\nmutcell(inp, state)\nmutcell(inp)\n\nArguments\n\ninp: The input to the mutcell. It should be a vector of size input_size or a matrix of size input_size x batch_size.\nstate: The hidden state of the MUTCell. It should be a vector of size hidden_size or a matrix of size hidden_size x batch_size. 
If not provided, it is assumed to be a vector of zeros, initialized by Flux.initialstates.\n\nReturns\n\nA tuple (output, state), where both elements are given by the updated state new_state, \n\na tensor of size hidden_size or hidden_size x batch_size.\n\n\n\n\n\n","category":"type"},{"location":"api/cells/#RecurrentLayers.SCRNCell","page":"Cells","title":"RecurrentLayers.SCRNCell","text":"SCRNCell((input_size => hidden_size)::Pair;\n init_kernel = glorot_uniform,\n init_recurrent_kernel = glorot_uniform,\n bias = true,\n alpha = 0.0)\n\nStructurally contraint recurrent unit. See SCRN for a layer that processes entire sequences.\n\nArguments\n\ninput_size => hidden_size: input and inner dimension of the layer\ninit_kernel: initializer for the input to hidden weights\ninit_recurrent_kernel: initializer for the hidden to hidden weights\nbias: include a bias or not. Default is true\nalpha: structural contraint. Default is 0.0\n\nEquations\n\nbeginaligned\ns_t = (1 - alpha) W_s x_t + alpha s_t-1 \nh_t = sigma(W_h s_t + U_h h_t-1 + b_h) \ny_t = f(U_y h_t + W_y s_t)\nendaligned\n\nForward\n\nscrncell(inp, (state, cstate))\nscrncell(inp)\n\nArguments\n\ninp: The input to the scrncell. It should be a vector of size input_size or a matrix of size input_size x batch_size.\n(state, cstate): A tuple containing the hidden and cell states of the SCRNCell. They should be vectors of size hidden_size or matrices of size hidden_size x batch_size. If not provided, they are assumed to be vectors of zeros, initialized by Flux.initialstates.\n\nReturns\n\nA tuple (output, state), where output = new_state is the new hidden state and state = (new_state, new_cstate) is the new hidden and cell state. They are tensors of size hidden_size or hidden_size x batch_size.\n\n\n\n\n\n","category":"type"},{"location":"api/cells/#RecurrentLayers.PeepholeLSTMCell","page":"Cells","title":"RecurrentLayers.PeepholeLSTMCell","text":"PeepholeLSTMCell((input_size => hidden_size)::Pair;\n init_kernel = glorot_uniform,\n init_recurrent_kernel = glorot_uniform,\n bias = true)\n\nPeephole long short term memory cell. See PeepholeLSTM for a layer that processes entire sequences.\n\nArguments\n\ninput_size => hidden_size: input and inner dimension of the layer\ninit_kernel: initializer for the input to hidden weights\ninit_recurrent_kernel: initializer for the hidden to hidden weights\nbias: include a bias or not. Default is true\n\nEquations\n\nbeginaligned\nf_t = sigma_g(W_f x_t + U_f c_t-1 + b_f) \ni_t = sigma_g(W_i x_t + U_i c_t-1 + b_i) \no_t = sigma_g(W_o x_t + U_o c_t-1 + b_o) \nc_t = f_t odot c_t-1 + i_t odot sigma_c(W_c x_t + b_c) \nh_t = o_t odot sigma_h(c_t)\nendaligned\n\nForward\n\npeepholelstmcell(inp, (state, cstate))\npeepholelstmcell(inp)\n\nArguments\n\ninp: The input to the peepholelstmcell. It should be a vector of size input_size or a matrix of size input_size x batch_size.\n(state, cstate): A tuple containing the hidden and cell states of the PeepholeLSTMCell. They should be vectors of size hidden_size or matrices of size hidden_size x batch_size. If not provided, they are assumed to be vectors of zeros, initialized by Flux.initialstates.\n\nReturns\n\nA tuple (output, state), where output = new_state is the new hidden state and state = (new_state, new_cstate) is the new hidden and cell state. 
They are tensors of size hidden_size or hidden_size x batch_size.\n\n\n\n\n\n","category":"type"},{"location":"api/cells/#RecurrentLayers.FastRNNCell","page":"Cells","title":"RecurrentLayers.FastRNNCell","text":"FastRNNCell((input_size => hidden_size), [activation];\n init_kernel = glorot_uniform,\n init_recurrent_kernel = glorot_uniform,\n bias = true)\n\nFast recurrent neural network cell. See FastRNN for a layer that processes entire sequences.\n\nArguments\n\ninput_size => hidden_size: input and inner dimension of the layer\nactivation: the activation function, defaults to tanh_fast\ninit_kernel: initializer for the input to hidden weights\ninit_recurrent_kernel: initializer for the hidden to hidden weights\nbias: include a bias or not. Default is true\n\nEquations\n\nbeginaligned\ntildeh_t = sigma(W_h x_t + U_h h_t-1 + b) \nh_t = alpha tildeh_t + beta h_t-1\nendaligned\n\nForward\n\nfastrnncell(inp, state)\nfastrnncell(inp)\n\nArguments\n\ninp: The input to the fastrnncell. It should be a vector of size input_size or a matrix of size input_size x batch_size.\nstate: The hidden state of the FastRNN. It should be a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros, initialized by Flux.initialstates.\n\nReturns\n\nA tuple (output, state), where both elements are given by the updated state new_state, a tensor of size hidden_size or hidden_size x batch_size.\n\n\n\n\n\n","category":"type"},{"location":"api/cells/#RecurrentLayers.FastGRNNCell","page":"Cells","title":"RecurrentLayers.FastGRNNCell","text":"FastGRNNCell((input_size => hidden_size), [activation];\n init_kernel = glorot_uniform,\n init_recurrent_kernel = glorot_uniform,\n bias = true)\n\nFast gated recurrent neural network cell. See FastGRNN for a layer that processes entire sequences.\n\nArguments\n\ninput_size => hidden_size: input and inner dimension of the layer\nactivation: the activation function, defaults to tanh_fast\ninit_kernel: initializer for the input to hidden weights\ninit_recurrent_kernel: initializer for the hidden to hidden weights\nbias: include a bias or not. Default is true\n\nEquations\n\nbeginaligned\nz_t = sigma(W_z x_t + U_z h_t-1 + b_z) \ntildeh_t = tanh(W_h x_t + U_h h_t-1 + b_h) \nh_t = big((zeta (1 - z_t) + nu) odot tildeh_tbig) + z_t odot h_t-1\nendaligned\n\nForward\n\nfastgrnncell(inp, state)\nfastgrnncell(inp)\n\nArguments\n\ninp: The input to the fastgrnncell. It should be a vector of size input_size or a matrix of size input_size x batch_size.\nstate: The hidden state of the FastGRNN. It should be a vector of size hidden_size or a matrix of size hidden_size x batch_size. 
If not provided, it is assumed to be a vector of zeros, initialized by Flux.initialstates.\n\nReturns\n\nA tuple (output, state), where both elements are given by the updated state new_state, a tensor of size hidden_size or hidden_size x batch_size.\n\n\n\n\n\n","category":"type"},{"location":"api/layers/#Cell-wrappers","page":"Layers","title":"Cell wrappers","text":"","category":"section"},{"location":"api/layers/","page":"Layers","title":"Layers","text":"RAN\nIndRNN\nLightRU\nLiGRU\nMGU\nNAS\nRHN\nMUT1\nMUT2\nMUT3\nSCRN\nPeepholeLSTM\nFastRNN\nFastGRNN","category":"page"},{"location":"api/layers/#RecurrentLayers.RAN","page":"Layers","title":"RecurrentLayers.RAN","text":"RAN(input_size => hidden_size; kwargs...)\n\nThe RANCell, introduced in this paper, is a recurrent cell layer which provides additional memory through the use of gates, and returns both ht and ct.\n\nSee RANCell for a layer that processes a single sequence.\n\nArguments\n\ninput_size => hidden_size: input and inner dimension of the layer\ninit_kernel: initializer for the input to hidden weights\ninit_recurrent_kernel: initializer for the hidden to hidden weights\nbias: include a bias or not. Default is true\n\nEquations\n\nbeginaligned\ntildec_t = W_c x_t \ni_t = sigma(W_i x_t + U_i h_t-1 + b_i) \nf_t = sigma(W_f x_t + U_f h_t-1 + b_f) \nc_t = i_t odot tildec_t + f_t odot c_t-1 \nh_t = g(c_t)\nendaligned\n\nForward\n\nran(inp, (state, cstate))\nran(inp)\n\nArguments\n\ninp: The input to the ran. It should be a vector of size input_size x len or a matrix of size input_size x len x batch_size.\n(state, cstate): A tuple containing the hidden and cell states of the RAN. They should be vectors of size hidden_size or matrices of size hidden_size x batch_size. If not provided, they are assumed to be vectors of zeros, initialized by Flux.initialstates.\n\nReturns\n\nNew hidden states new_states as an array of size hidden_size x len x batch_size.\n\n\n\n\n\n","category":"type"},{"location":"api/layers/#RecurrentLayers.IndRNN","page":"Layers","title":"RecurrentLayers.IndRNN","text":"IndRNN((input_size => hidden_size)::Pair, σ=relu;\n kwargs...)\n\nIndependently recurrent network. See IndRNNCell for a layer that processes a single sequence.\n\nArguments\n\ninput_size => hidden_size: input and inner dimension of the layer\nσ: activation function. Default is relu\ninit_kernel: initializer for the input to hidden weights\ninit_recurrent_kernel: initializer for the hidden to hidden weights\nbias: include a bias or not. Default is true\n\nEquations\n\nmathbfh_t = sigma(mathbfW mathbfx_t + mathbfu odot mathbfh_t-1 + mathbfb)\n\nForward\n\nindrnn(inp, state)\nindrnn(inp)\n\nArguments\n\ninp: The input to the indrnn. It should be a vector of size input_size x len or a matrix of size input_size x len x batch_size.\nstate: The hidden state of the IndRNN. If given, it is a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros, initialized by Flux.initialstates.\n\nReturns\n\nNew hidden states new_states as an array of size hidden_size x len x batch_size.\n\n\n\n\n\n","category":"type"},{"location":"api/layers/#RecurrentLayers.LightRU","page":"Layers","title":"RecurrentLayers.LightRU","text":"LightRU((input_size => hidden_size)::Pair; kwargs...)\n\nLight recurrent unit network. 
See LightRUCell for a layer that processes a single sequence.\n\nArguments\n\ninput_size => hidden_size: input and inner dimension of the layer\ninit_kernel: initializer for the input to hidden weights\ninit_recurrent_kernel: initializer for the hidden to hidden weights\nbias: include a bias or not. Default is true\n\nEquations\n\nbeginaligned\ntildeh_t = tanh(W_h x_t) \nf_t = delta(W_f x_t + U_f h_t-1 + b_f) \nh_t = (1 - f_t) odot h_t-1 + f_t odot tildeh_t\nendaligned\n\nForward\n\nlightru(inp, state)\nlightru(inp)\n\nArguments\n\ninp: The input to the lightru. It should be a vector of size input_size x len or a matrix of size input_size x len x batch_size.\nstate: The hidden state of the LightRU. If given, it is a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros, initialized by Flux.initialstates.\n\nReturns\n\nNew hidden states new_states as an array of size hidden_size x len x batch_size.\n\n\n\n\n\n","category":"type"},{"location":"api/layers/#RecurrentLayers.LiGRU","page":"Layers","title":"RecurrentLayers.LiGRU","text":"LiGRU((input_size => hidden_size)::Pair; kwargs...)\n\nLight gated recurrent network. The implementation does not include the batch normalization as described in the original paper. See LiGRUCell for a layer that processes a single sequence.\n\nArguments\n\ninput_size => hidden_size: input and inner dimension of the layer\ninit_kernel: initializer for the input to hidden weights\ninit_recurrent_kernel: initializer for the hidden to hidden weights\nbias: include a bias or not. Default is true\n\nEquations\n\nbeginaligned\nz_t = sigma(W_z x_t + U_z h_t-1) \ntildeh_t = textReLU(W_h x_t + U_h h_t-1) \nh_t = z_t odot h_t-1 + (1 - z_t) odot tildeh_t\nendaligned\n\nForward\n\nligru(inp, state)\nligru(inp)\n\nArguments\n\ninp: The input to the ligru. It should be a vector of size input_size x len or a matrix of size input_size x len x batch_size.\nstate: The hidden state of the LiGRU. If given, it is a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros, initialized by Flux.initialstates.\n\nReturns\n\nNew hidden states new_states as an array of size hidden_size x len x batch_size.\n\n\n\n\n\n","category":"type"},{"location":"api/layers/#RecurrentLayers.MGU","page":"Layers","title":"RecurrentLayers.MGU","text":"MGU((input_size => hidden_size)::Pair; kwargs...)\n\nMinimal gated unit network. See MGUCell for a layer that processes a single sequence.\n\nArguments\n\ninput_size => hidden_size: input and inner dimension of the layer\ninit_kernel: initializer for the input to hidden weights\ninit_recurrent_kernel: initializer for the hidden to hidden weights\nbias: include a bias or not. Default is true\n\nEquations\n\nbeginaligned\nf_t = sigma(W_f x_t + U_f h_t-1 + b_f) \ntildeh_t = tanh(W_h x_t + U_h (f_t odot h_t-1) + b_h) \nh_t = (1 - f_t) odot h_t-1 + f_t odot tildeh_t\nendaligned\n\nForward\n\nmgu(inp, state)\nmgu(inp)\n\nArguments\n\ninp: The input to the mgu. It should be a vector of size input_size x len or a matrix of size input_size x len x batch_size.\nstate: The hidden state of the MGU. If given, it is a vector of size hidden_size or a matrix of size hidden_size x batch_size. 
If not provided, it is assumed to be a vector of zeros, initialized by Flux.initialstates.\n\nReturns\n\nNew hidden states new_states as an array of size hidden_size x len x batch_size.\n\n\n\n\n\n","category":"type"},{"location":"api/layers/#RecurrentLayers.NAS","page":"Layers","title":"RecurrentLayers.NAS","text":"NAS((input_size => hidden_size)::Pair; kwargs...)\n\nNeural Architecture Search unit. See NASCell for a layer that processes a single sequence.\n\nArguments\n\ninput_size => hidden_size: input and inner dimension of the layer\ninit_kernel: initializer for the input to hidden weights\ninit_recurrent_kernel: initializer for the hidden to hidden weights\nbias: include a bias or not. Default is true\n\nEquations\n\nbeginaligned\ntextFirst Layer Outputs \no_1 = sigma(W_i^(1) x_t + W_h^(1) h_t-1 + b^(1)) \no_2 = textReLU(W_i^(2) x_t + W_h^(2) h_t-1 + b^(2)) \no_3 = sigma(W_i^(3) x_t + W_h^(3) h_t-1 + b^(3)) \no_4 = textReLU(W_i^(4) x_t cdot W_h^(4) h_t-1) \no_5 = tanh(W_i^(5) x_t + W_h^(5) h_t-1 + b^(5)) \no_6 = sigma(W_i^(6) x_t + W_h^(6) h_t-1 + b^(6)) \no_7 = tanh(W_i^(7) x_t + W_h^(7) h_t-1 + b^(7)) \no_8 = sigma(W_i^(8) x_t + W_h^(8) h_t-1 + b^(8)) \n\ntextSecond Layer Computations \nl_1 = tanh(o_1 cdot o_2) \nl_2 = tanh(o_3 + o_4) \nl_3 = tanh(o_5 cdot o_6) \nl_4 = sigma(o_7 + o_8) \n\ntextInject Cell State \nl_1 = tanh(l_1 + c_textstate) \n\ntextFinal Layer Computations \nc_textnew = l_1 cdot l_2 \nl_5 = tanh(l_3 + l_4) \nh_textnew = tanh(c_textnew cdot l_5)\nendaligned\n\nForward\n\nnas(inp, (state, cstate))\nnas(inp)\n\nArguments\n\ninp: The input to the nas. It should be a vector of size input_size x len or a matrix of size input_size x len x batch_size.\n(state, cstate): A tuple containing the hidden and cell states of the NAS. They should be vectors of size hidden_size or matrices of size hidden_size x batch_size. If not provided, they are assumed to be vectors of zeros, initialized by Flux.initialstates.\n\nReturns\n\nNew hidden states new_states as an array of size hidden_size x len x batch_size.\n\n\n\n\n\n","category":"type"},{"location":"api/layers/#RecurrentLayers.RHN","page":"Layers","title":"RecurrentLayers.RHN","text":"RHN((input_size => hidden_size)::Pair, depth=3; kwargs...)\n\nRecurrent highway network. See RHNCellUnit for the unit component of this layer. See RHNCell for a layer that processes a single sequence.\n\nArguments\n\ninput_size => hidden_size: input and inner dimension of the layer\ndepth: depth of the recurrence. Default is 3\ncouple_carry: couples the carry gate and the transform gate. Default is true\ninit_kernel: initializer for the input to hidden weights\nbias: include a bias or not. Default is true\n\nEquations\n\nbeginaligned\ns_ell^t = h_ell^t odot t_ell^t + s_ell-1^t odot c_ell^t \ntextwhere \nh_ell^t = tanh(W_h x^tmathbbI_ell = 1 + U_h_ell s_ell-1^t + b_h_ell) \nt_ell^t = sigma(W_t x^tmathbbI_ell = 1 + U_t_ell s_ell-1^t + b_t_ell) \nc_ell^t = sigma(W_c x^tmathbbI_ell = 1 + U_c_ell s_ell-1^t + b_c_ell)\nendaligned\n\n\n\n\n\n","category":"type"},{"location":"api/layers/#RecurrentLayers.MUT1","page":"Layers","title":"RecurrentLayers.MUT1","text":"MUT1((input_size => hidden_size); kwargs...)\n\nMutated unit 1 network. See MUT1Cell for a layer that processes a single sequence.\n\nArguments\n\ninput_size => hidden_size: input and inner dimension of the layer\ninit_kernel: initializer for the input to hidden weights\ninit_recurrent_kernel: initializer for the hidden to hidden weights\nbias: include a bias or not. 
Default is true\n\nEquations\n\nbeginaligned\nz = sigma(W_z x_t + b_z) \nr = sigma(W_r x_t + U_r h_t + b_r) \nh_t+1 = tanh(U_h (r odot h_t) + tanh(W_h x_t) + b_h) odot z \nquad + h_t odot (1 - z)\nendaligned\n\nForward\n\nmut(inp, state)\nmut(inp)\n\nArguments\n\ninp: The input to the mut. It should be a vector of size input_size x len or a matrix of size input_size x len x batch_size.\nstate: The hidden state of the MUT. If given, it is a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros, initialized by Flux.initialstates.\n\nReturns\n\nNew hidden states new_states as an array of size hidden_size x len x batch_size.\n\n\n\n\n\n","category":"type"},{"location":"api/layers/#RecurrentLayers.MUT2","page":"Layers","title":"RecurrentLayers.MUT2","text":"MUT2((input_size => hidden_size); kwargs...)\n\nMutated unit 2 network. See MUT2Cell for a layer that processes a single sequence.\n\nArguments\n\ninput_size => hidden_size: input and inner dimension of the layer\ninit_kernel: initializer for the input to hidden weights\ninit_recurrent_kernel: initializer for the hidden to hidden weights\nbias: include a bias or not. Default is true\n\nEquations\n\nbeginaligned\nz = sigma(W_z x_t + U_z h_t + b_z) \nr = sigma(x_t + U_r h_t + b_r) \nh_t+1 = tanh(U_h (r odot h_t) + W_h x_t + b_h) odot z \nquad + h_t odot (1 - z)\nendaligned\n\nForward\n\nmut(inp, state)\nmut(inp)\n\nArguments\n\ninp: The input to the mut. It should be a vector of size input_size x len or a matrix of size input_size x len x batch_size.\nstate: The hidden state of the MUT. If given, it is a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros, initialized by Flux.initialstates.\n\nReturns\n\nNew hidden states new_states as an array of size hidden_size x len x batch_size.\n\n\n\n\n\n","category":"type"},{"location":"api/layers/#RecurrentLayers.MUT3","page":"Layers","title":"RecurrentLayers.MUT3","text":"MUT3((input_size => hidden_size); kwargs...)\n\nMutated unit 3 network. See MUT3Cell for a layer that processes a single sequence.\n\nArguments\n\ninput_size => hidden_size: input and inner dimension of the layer\ninit_kernel: initializer for the input to hidden weights\ninit_recurrent_kernel: initializer for the hidden to hidden weights\nbias: include a bias or not. Default is true\n\nEquations\n\nbeginaligned\nz = sigma(W_z x_t + U_z tanh(h_t) + b_z) \nr = sigma(W_r x_t + U_r h_t + b_r) \nh_t+1 = tanh(U_h (r odot h_t) + W_h x_t + b_h) odot z \nquad + h_t odot (1 - z)\nendaligned\n\nForward\n\nmut(inp, state)\nmut(inp)\n\nArguments\n\ninp: The input to the mut. It should be a vector of size input_size x len or a matrix of size input_size x len x batch_size.\nstate: The hidden state of the MUT. If given, it is a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros, initialized by Flux.initialstates.\n\nReturns\n\nNew hidden states new_states as an array of size hidden_size x len x batch_size.\n\n\n\n\n\n","category":"type"},{"location":"api/layers/#RecurrentLayers.SCRN","page":"Layers","title":"RecurrentLayers.SCRN","text":"SCRN((input_size => hidden_size)::Pair;\n init_kernel = glorot_uniform,\n init_recurrent_kernel = glorot_uniform,\n bias = true,\n alpha = 0.0)\n\nStructurally constrained recurrent unit. 
See SCRNCell for a layer that processes a single sequence.\n\nArguments\n\ninput_size => hidden_size: input and inner dimension of the layer\ninit_kernel: initializer for the input to hidden weights\ninit_recurrent_kernel: initializer for the hidden to hidden weights\nbias: include a bias or not. Default is true\nalpha: structural constraint. Default is 0.0\n\nEquations\n\nbeginaligned\ns_t = (1 - alpha) W_s x_t + alpha s_t-1 \nh_t = sigma(W_h s_t + U_h h_t-1 + b_h) \ny_t = f(U_y h_t + W_y s_t)\nendaligned\n\nForward\n\nscrn(inp, (state, cstate))\nscrn(inp)\n\nArguments\n\ninp: The input to the scrn. It should be a vector of size input_size x len or a matrix of size input_size x len x batch_size.\n(state, cstate): A tuple containing the hidden and cell states of the SCRN. They should be vectors of size hidden_size or matrices of size hidden_size x batch_size. If not provided, they are assumed to be vectors of zeros, initialized by Flux.initialstates.\n\nReturns\n\nNew hidden states new_states as an array of size hidden_size x len x batch_size.\n\n\n\n\n\n","category":"type"},{"location":"api/layers/#RecurrentLayers.PeepholeLSTM","page":"Layers","title":"RecurrentLayers.PeepholeLSTM","text":"PeepholeLSTM((input_size => hidden_size)::Pair; kwargs...)\n\nPeephole long short term memory network. See PeepholeLSTMCell for a layer that processes a single sequence.\n\nArguments\n\ninput_size => hidden_size: input and inner dimension of the layer\ninit_kernel: initializer for the input to hidden weights\ninit_recurrent_kernel: initializer for the hidden to hidden weights\nbias: include a bias or not. Default is true\n\nEquations\n\nbeginalign\nf_t = sigma_g(W_f x_t + U_f c_t-1 + b_f) \ni_t = sigma_g(W_i x_t + U_i c_t-1 + b_i) \no_t = sigma_g(W_o x_t + U_o c_t-1 + b_o) \nc_t = f_t odot c_t-1 + i_t odot sigma_c(W_c x_t + b_c) \nh_t = o_t odot sigma_h(c_t)\nendalign\n\nForward\n\npeepholelstm(inp, (state, cstate))\npeepholelstm(inp)\n\nArguments\n\ninp: The input to the peepholelstm. It should be a vector of size input_size x len or a matrix of size input_size x len x batch_size.\n(state, cstate): A tuple containing the hidden and cell states of the PeepholeLSTM. They should be vectors of size hidden_size or matrices of size hidden_size x batch_size. If not provided, they are assumed to be vectors of zeros, initialized by Flux.initialstates.\n\nReturns\n\nNew hidden states new_states as an array of size hidden_size x len x batch_size.\n\n\n\n\n\n","category":"type"},{"location":"api/layers/#RecurrentLayers.FastRNN","page":"Layers","title":"RecurrentLayers.FastRNN","text":"FastRNN((input_size => hidden_size), [activation]; kwargs...)\n\nFast recurrent neural network. See FastRNNCell for a layer that processes a single sequence.\n\nArguments\n\ninput_size => hidden_size: input and inner dimension of the layer\nactivation: the activation function, defaults to tanh_fast\ninit_kernel: initializer for the input to hidden weights\ninit_recurrent_kernel: initializer for the hidden to hidden weights\nbias: include a bias or not. Default is true\n\nEquations\n\nbeginaligned\ntildeh_t = sigma(W_h x_t + U_h h_t-1 + b) \nh_t = alpha tildeh_t + beta h_t-1\nendaligned\n\nForward\n\nfastrnn(inp, state)\nfastrnn(inp)\n\nArguments\n\ninp: The input to the fastrnn. It should be a vector of size input_size x len or a matrix of size input_size x len x batch_size.\nstate: The hidden state of the FastRNN. If given, it is a vector of size hidden_size or a matrix of size hidden_size x batch_size. 
If not provided, it is assumed to be a vector of zeros, initialized by Flux.initialstates.\n\nReturns\n\nNew hidden states new_states as an array of size hidden_size x len x batch_size.\n\n\n\n\n\n","category":"type"},{"location":"api/layers/#RecurrentLayers.FastGRNN","page":"Layers","title":"RecurrentLayers.FastGRNN","text":"FastGRNN((input_size => hidden_size), [activation]; kwargs...)\n\nFast gated recurrent neural network. See FastGRNNCell for a layer that processes a single sequence.\n\nArguments\n\ninput_size => hidden_size: input and inner dimension of the layer\nactivation: the activation function, defaults to tanh_fast\ninit_kernel: initializer for the input to hidden weights\ninit_recurrent_kernel: initializer for the hidden to hidden weights\nbias: include a bias or not. Default is true\n\nEquations\n\nbeginaligned\nz_t = sigma(W_z x_t + U_z h_t-1 + b_z) \ntildeh_t = tanh(W_h x_t + U_h h_t-1 + b_h) \nh_t = big((zeta (1 - z_t) + nu) odot tildeh_tbig) + z_t odot h_t-1\nendaligned\n\nForward\n\nfastgrnn(inp, state)\nfastgrnn(inp)\n\nArguments\n\ninp: The input to the fastgrnn. It should be a vector of size input_size x len or a matrix of size input_size x len x batch_size.\nstate: The hidden state of the FastGRNN. If given, it is a vector of size hidden_size or a matrix of size hidden_size x batch_size. If not provided, it is assumed to be a vector of zeros, initialized by Flux.initialstates.\n\nReturns\n\nNew hidden states new_states as an array of size hidden_size x len x batch_size.\n\n\n\n\n\n","category":"type"},{"location":"roadmap/#Roadmap","page":"Roadmap","title":"Roadmap","text":"","category":"section"},{"location":"roadmap/","page":"Roadmap","title":"Roadmap","text":"This page documents some planned work for RecurrentLayers.jl. Future work for this library includes additional cells such as:","category":"page"},{"location":"roadmap/","page":"Roadmap","title":"Roadmap","text":"FastRNNs and FastGRNNs (current focus) arxiv\nUnitary recurrent neural networks arxiv\nModern recurrent neural networks such as LRU and minLSTM/minGRU\nQuasi recurrent neural networks arxiv","category":"page"},{"location":"roadmap/","page":"Roadmap","title":"Roadmap","text":"Additionally, some cell-independent architectures are planned that expand the capabilities of recurrent architectures and could theoretically take any cell:","category":"page"},{"location":"roadmap/","page":"Roadmap","title":"Roadmap","text":"Clockwork RNNs arxiv\nPhased RNNs arxiv\nSegment RNN arxiv\nFast-Slow RNNs arxiv","category":"page"},{"location":"roadmap/","page":"Roadmap","title":"Roadmap","text":"An implementation of these would ideally look like, for example, FastSlow(RNNCell, input_size => hidden_size). More details on this soon!","category":"page"},{"location":"","page":"Home","title":"Home","text":"CurrentModule = RecurrentLayers","category":"page"},{"location":"#RecurrentLayers","page":"Home","title":"RecurrentLayers","text":"","category":"section"},{"location":"","page":"Home","title":"Home","text":"RecurrentLayers.jl extends the Flux.jl recurrent layer offering by providing implementations of bleeding-edge recurrent layers not commonly available in base deep learning libraries. 
It is designed for seamless integration with the larger Flux ecosystem, enabling researchers and practitioners to leverage the latest developments in recurrent neural networks.","category":"page"},{"location":"#Implemented-layers","page":"Home","title":"Implemented layers","text":"","category":"section"},{"location":"","page":"Home","title":"Home","text":"Minimal gated unit as MGUCell arxiv\nLight gated recurrent unit as LiGRUCell arxiv\nIndependently recurrent neural networks as IndRNNCell arxiv\nRecurrent additive networks as RANCell arxiv\nRecurrent highway network as RHNCell arxiv\nLight recurrent unit as LightRUCell pub\nNeural architecture search unit as NASCell arxiv\nEvolving recurrent neural networks as MUT1Cell, MUT2Cell, MUT3Cell pub\nStructurally constrained recurrent neural network as SCRNCell arxiv\nPeephole long short term memory as PeepholeLSTMCell pub\nFastRNNCell and FastGRNNCell arxiv","category":"page"},{"location":"#Contributing","page":"Home","title":"Contributing","text":"","category":"section"},{"location":"","page":"Home","title":"Home","text":"Contributions are always welcome! We specifically look for:","category":"page"},{"location":"","page":"Home","title":"Home","text":"Recurrent cells you would like to see implemented \nBenchmarks\nAny bugs and mistakes of course!\nDocumentation, in any form: examples, how-tos, docstrings ","category":"page"}] }
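
The layer docstrings indexed above all share one calling convention: a full sequence of shape input_size x len x batch_size goes in, and the hidden states for every step come out, with an optional explicit initial state. A minimal usage sketch of that convention, assuming only the constructors and forward signatures documented above; the concrete sizes and the choice of MGU are illustrative:

```julia
using Flux, RecurrentLayers

input_size, hidden_size, len, batch_size = 4, 8, 10, 2

layer = MGU(input_size => hidden_size)            # full-sequence wrapper around MGUCell
x = rand(Float32, input_size, len, batch_size)    # input_size x len x batch_size

h = layer(x)                                      # hidden states for every time step
@assert size(h) == (hidden_size, len, batch_size)

# An explicit initial state can also be passed, as in the documented mgu(inp, state) form
h0 = zeros(Float32, hidden_size, batch_size)
h = layer(x, h0)
```

Layers that carry both a hidden and a cell state (RAN, NAS, SCRN, PeepholeLSTM) take a (state, cstate) tuple instead of a single state, as their docstrings note.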
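For the single-step cells, a corresponding sketch under the same assumptions, using the documented scrncell(inp) and scrncell(inp, (state, cstate)) forms; again, the numbers are only illustrative:

```julia
using Flux, RecurrentLayers

input_size, hidden_size, batch_size = 4, 8, 2

cell = SCRNCell(input_size => hidden_size)        # processes one time step at a time
x1 = rand(Float32, input_size, batch_size)        # input_size x batch_size

# Without an explicit state the cell starts from zeros (via Flux.initialstates)
out, (h, c) = cell(x1)
@assert size(out) == (hidden_size, batch_size)

# Carry the (hidden, context) state forward to the next step
x2 = rand(Float32, input_size, batch_size)
out2, (h2, c2) = cell(x2, (h, c))
```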