
Move RNN to layers.py and make it stateless. #97

Open
wants to merge 5 commits into master

Conversation

aterzis-google
Collaborator

No description provided.

objax/nn/layers.py (outdated)
@@ -327,6 +327,63 @@ def __call__(self, x: JaxArray) -> JaxArray:
        self.avg.value += (self.avg.value - x) * (self.momentum - 1)
        return self.avg.value

class RNN(Module):
Member

I think the name RNN is too generic. Pretty much any type of recurrent block (LSTM, GRU, ...) could be called an RNN. Is there a better name for it?

@david-berthelot (Contributor), Oct 14, 2020

Also, RNN refers to the architecture, not to the cell. Here's what TF/Keras does: https://www.tensorflow.org/api_docs/python/tf/compat/v1/nn/rnn_cell/RNNCell
Not sure what PyTorch does.


This is a specific RNN architecture that operates across time (so not a cell). I would call this something like SimpleRNN, and make sure it replicates Keras' SimpleRNN functionality with default arguments:

https://www.tensorflow.org/api_docs/python/tf/keras/layers/SimpleRNN

You could reserve RNN for an object that takes an RNNCell and performs a scan across time.
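
For reference, a minimal sketch of that second idea: a wrapper that scans a cell across time with jax.lax.scan. The wrapper name and the cell(x_t, state) -> (y_t, new_state) interface are illustrative assumptions, not part of this PR:

import jax
import objax


class CellRNN(objax.Module):  # hypothetical name, to avoid clashing with this PR's class
    def __init__(self, cell):
        self.cell = cell  # any module implementing cell(x_t, state) -> (y_t, new_state)

    def __call__(self, inputs, initial_state):
        # inputs has time as its leading axis; scan carries the state forward.
        def step(state, x_t):
            y_t, new_state = self.cell(x_t, state)
            return new_state, y_t

        final_state, outputs = jax.lax.scan(step, initial_state, inputs)
        return outputs, final_state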

Collaborator Author

Changed the name to SimpleRNN.


        self.output_layer = Linear(self.nstate, self.num_outputs)

    def __call__(self, inputs: JaxArray, only_return_final=False) -> JaxArray:

Suggest adding a get_initial_state method and an optional initial_state argument here.

Collaborator Author

I added an optional initial_state argument to the call() method.

Can you clarify what the get_initial_state() method would do, considering that the state is initialized during every call() (unless explicitly passed in through the optional argument)?


There are two reasons to have a get_initial_state. One, the caller wants to know if this layer is recurrent without checking for some general instance type. Two, the caller wants to know the shapes etc. of the state without running __call__. This is useful for many reasons, like creating buffers for storing state.

Contributor

Just to clarify, does get_init_state really act like a create_init_state? Or is there an init_state stored inside the instance?


No; it's a purely functional thing that returns some arrays.

Member

As far as I understood from some of the Keras code, get_initial_state simply returns a zero array of the appropriate shape (ex: https://github.com/tensorflow/tensorflow/blob/fcc4b966f1265f466e82617020af93670141b009/tensorflow/python/keras/layers/recurrent.py#L1948 )

It's still not very clear how useful it is. Could you point us to some example of how it's actually used (either in TensorFlow or any other framework)?

To know the shape of the state, it would be better to just call rnn_cell_layer.nstate, or maybe have a helper method get_state_shape. Using get_initial_state as a way to determine whether a layer is an RNN seems a little weird; I don't see how getattr(layer, 'get_initial_state') is better than isinstance(layer, RNNCell). If there is a need to determine whether a layer is an RNN cell, I think it's better to just make all RNN cells inherit from some base class and do an isinstance check.
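
For concreteness, a sketch of that base-class route; RNNCell, nstate and is_recurrent are hypothetical names here, not existing Objax API:

import jax.numpy as jn
import objax


class RNNCell(objax.Module):
    """Hypothetical common base class for recurrent cells; nstate is the state size."""
    nstate: int

    def get_initial_state(self):
        # Purely functional: returns a fresh zero state, stores nothing in the module.
        return jn.zeros(self.nstate)


def is_recurrent(layer):
    # The isinstance check that replaces getattr(layer, 'get_initial_state', None).
    return isinstance(layer, RNNCell)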

objax/nn/layers.py (outdated)
        only_return_final: return only the last output if ``True``, or all outputs otherwise.

    Returns:
        Output tensor with dimensions ``N * batch_size, vocabulary_size``.

Is vocabulary_size the right terminology for RNNs? Perhaps you mean nout here?

Also, why is batch_size included here? I thought you don't consider batch_size in these layers?

Collaborator Author

Changed vocabulary_size -> nout.

I include batch_size because we can process a batch of input data.


@david-berthelot do other layers "know" about batch dimensions? Does this one need to?

@ebrevdo, Nov 3, 2020

(From David on another PR: no, layers don't know about batch dimensions, so this one shouldn't either. Instead, add a unit test with this object and Vectorize.)
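
A hedged sketch of that test pattern, assuming objax.Vectorize's default of mapping over a leading batch axis; the SimpleRNN constructor arguments are guesses based on the attribute names in the diff:

import jax.numpy as jn
import objax

# Per-example layer: input shaped (time, num_inputs), no batch dimension.
rnn = objax.nn.SimpleRNN(nstate=64, num_inputs=32, num_outputs=32)  # assumed signature

# Vectorize vmaps __call__ over axis 0, adding the batch dimension externally.
batched_rnn = objax.Vectorize(rnn)

x = jn.zeros((8, 10, 32))  # (batch, time, num_inputs), illustrative shapes
outputs, state = batched_rnn(x)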

@ebrevdo left a comment

Needs a unit test.

@jli05 left a comment

What makes the RNN stand out in this lib, for me, is the code's readability and simplicity. Anyone can easily extend it.

objax/nn/layers.py
Comment on lines +375 to +392:
    jn.dot(x, self.w_xh.value)
    + jn.dot(state, self.w_hh.value)

num_inputs could be zero -- essentially empty inputs, while the internal state continues to evolve over time.

Not sure if we should use two weight matrices or one acting on the concatenated [h, x].


Typically it's more efficient to act on one concatenated [h, x], but it depends on the system and sizes. At some point you could make this an __init__ mode parameter like Keras does. For now I'd suggest using the concatenated format.
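
To make the two options concrete, a sketch with illustrative sizes; tanh is assumed as the nonlinearity, and w_hx is a hypothetical name for the fused matrix (w_xh, w_hh, b_h follow the diff):

import jax.numpy as jn

nin, nstate, batch = 32, 64, 8
x = jn.zeros((batch, nin))
state = jn.zeros((batch, nstate))
w_xh = jn.zeros((nin, nstate))
w_hh = jn.zeros((nstate, nstate))
b_h = jn.zeros(nstate)

# Two matrices, as the PR currently does:
state_a = jn.tanh(x.dot(w_xh) + state.dot(w_hh) + b_h)

# One matrix acting on the concatenated [h, x]:
w_hx = jn.concatenate([w_hh, w_xh], axis=0)  # shape (nstate + nin, nstate)
state_b = jn.tanh(jn.concatenate([state, x], axis=1).dot(w_hx) + b_h)
# state_a == state_b: the fused form computes the same update in one matmul.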

Contributor

Another nit: use x.dot(y) rather than jn.dot(x, y), since we might as well take advantage of object-oriented APIs.

            + jn.dot(state, self.w_hh.value)
            + self.b_h.value
        )
        y = self.output_layer(state)

Do we need output_layer, or can we directly return the internal state h and let the user do further transforms on it?

Collaborator Author

I opted for having an output_layer.


Question why: this is something the user can do themselves afterwards, right? So is there any purpose to adding an output_layer?

Contributor

I would drop the output layer; it forces a decision on the user about what type of output they'd want.
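
What dropping it would look like for the caller (sketch; the SimpleRNN signature without num_outputs is hypothetical):

import jax.numpy as jn
import objax

rnn = objax.nn.SimpleRNN(nstate=64, num_inputs=32)  # returns hidden states only
head = objax.nn.Linear(64, 10)                      # user-chosen output transform

x = jn.zeros((10, 32))        # (time, num_inputs), illustrative
states, final_state = rnn(x)  # states: (time, 64)
y = head(states)              # (time, 10)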

google-cla bot commented Oct 28, 2020

All (the pull request submitter and all commit authors) CLAs are signed, but one or more commits were authored or co-authored by someone other than the pull request submitter.

We need to confirm that all authors are ok with their commits being contributed to this project. Please have them confirm that by leaving a comment that contains only @googlebot I consent. in this pull request.

Note to project maintainer: There may be cases where the author cannot leave a comment, or the comment is not properly detected as consent. In those cases, you can manually confirm consent of the commit author(s), and set the cla label to yes (if enabled on your project).

ℹ️ Googlers: Go here for more info.


if only_return_final:
    return y, state
else:
Contributor

No need for else.

if only_return_final:
    return y, state
else:
    return jn.concatenate(outputs, axis=0), state
Contributor

Should it be jn.stack?
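
The difference in question, with illustrative shapes:

import jax.numpy as jn

T, batch, nout = 5, 8, 16
outputs = [jn.zeros((batch, nout)) for _ in range(T)]  # per-step outputs

jn.concatenate(outputs, axis=0).shape  # (T * batch, nout): time folded into the row axis
jn.stack(outputs, axis=0).shape        # (T, batch, nout): an explicit time axis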


Comment on lines +368 to +369:
def __call__(self, inputs: JaxArray, initial_state: JaxArray = None,
             only_return_final: bool = False) -> Tuple[JaxArray, JaxArray]:
Contributor

One argument per line if they don't all fit on one line.
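
That is, formatted like:

def __call__(self,
             inputs: JaxArray,
             initial_state: JaxArray = None,
             only_return_final: bool = False) -> Tuple[JaxArray, JaxArray]: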



def loss(x, label):  # sum(label * log(softmax(logit)))
-   logit = model(x)
+   logits, _ = model(x)
    return objax.functional.loss.cross_entropy_logits(logit, label).mean()
Contributor

logits = model(x)[0]

outputs = [vocab[prefix[0]]]
get_input = lambda: one_hot(jn.array([outputs[-1]]).reshape(1, 1), len(vocab))
for y in prefix[1:]:  # Warmup state with prefix
    model(get_input())
    outputs.append(vocab[y])
for _ in range(num_predicts):  # Predict num_predicts steps
-   Y = model(get_input())
+   Y, _ = model(get_input())
Contributor

1. Uppercase is for global constants; use lowercase identifiers for variables, please.
2. Rather than doing two assigns, the better way is to just assign what you use:
   Y = model(get_input())[0]

Comment on lines +25 to +29
<<<<<<< HEAD:examples/text_generation/shakespeare_rnn.py
from objax.nn import SimpleRNN
=======
from objax.nn import RNN
>>>>>>> 2c04d4e (Move RNN to layers.py and make it stateless.):examples/rnn/shakespeare.py
Contributor

Your commit contains an unresolved merge conflict.
