Repeat Copy task #8
Training the NTM on sequences of length one

Just like in #4, I started the experiments on the Repeat Copy task by training the NTM on sequences of length one with random repetitions between 1 and 5. The training went surprisingly well and converged in a few thousand iterations (see the learning curve below). Tests on sequences of length 1 show that the NTM is able to properly repeat the length-one inputs it was trained on. However, learning only on sequences of length one may bias the experiment and not actually teach the proper task. Instead, the NTM seems to have learned the following procedure:
Hence the model does not show any sign of generalization yet, and tests on longer sequences seem to confirm that the NTM writes the whole input sequence to the first memory address, overwriting previous steps.

Learning curve
Similar to #4.

Parameters of the experiment
Same parameters as in #6.
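For reference, here is a minimal sketch of how such length-one training examples could be generated. This is not the generator from this repository; the channel layout (content channels plus one extra channel carrying the repetition count on the step after the sequence) is an assumption made only for illustration.

```python
import numpy as np

def repeat_copy_example(length=1, max_repeats=5, num_bits=8, rng=np.random):
    """Hypothetical generator for one Repeat Copy training example."""
    repeats = rng.randint(1, max_repeats + 1)
    seq = rng.randint(0, 2, size=(length, num_bits)).astype(np.float32)

    # Input: the sequence itself, then one extra step carrying the
    # repetition count on a dedicated channel (layout is an assumption).
    inputs = np.zeros((length + 1, num_bits + 1), dtype=np.float32)
    inputs[:length, :num_bits] = seq
    inputs[length, num_bits] = repeats

    # Target: the sequence repeated `repeats` times.
    targets = np.tile(seq, (repeats, 1))
    return inputs, targets
```

With `length=1`, only a single content vector ever has to be stored, which is what makes the degenerate single-address strategy described above sufficient during training.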
Training on sequences of length 3-5

As the initial experiment in #8 (comment) suggested, training on very short sequences raises the problem of not learning the proper repeat task.
Therefore I trained the NTM on sequences of length 3 to 5 with a number of repetitions up to 5. After running a few tests, it appears that the NTM is indeed able to repeat the input sequence on its output. What is particularly outstanding is that it seems to show great generalization properties, both in the length of the input sequence and in the number of repetitions. Here are a few tests:
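A simple way to quantify such tests is to sweep the sequence length and the number of repetitions beyond the training range and measure the fraction of wrongly predicted bits. The sketch below is only an illustration: it assumes a `predict(inputs)` callable that returns the NTM's binarized output, which is not part of this repository's API, and it reuses the hypothetical input layout from the generator above.

```python
import numpy as np

def bit_error_rate(predict, lengths=(5, 10, 20), repeats=(1, 5, 10),
                   num_trials=20, num_bits=8, rng=np.random):
    """Average fraction of wrong bits for each (length, repeats) setting."""
    results = {}
    for length in lengths:
        for r in repeats:
            errors = []
            for _ in range(num_trials):
                seq = rng.randint(0, 2, size=(length, num_bits)).astype(np.float32)
                inputs = np.zeros((length + 1, num_bits + 1), dtype=np.float32)
                inputs[:length, :num_bits] = seq
                inputs[length, num_bits] = r
                target = np.tile(seq, (r, 1))
                output = predict(inputs)  # assumed to return {0, 1} values
                errors.append(np.mean(output != target))
            results[(length, r)] = float(np.mean(errors))
    return results
```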
However, even if the generalization quality looks good, the NTM sometimes misses a few bits in its predictions (see below). Another surprising behavior is that the NTM writes the number of repetitions to the same address as the last vector of the input sequence, which may explain this: when the number of repetitions is too large, it may overwrite all or part of the last input vector in memory. If we look at the

Learning curve

Parameters of the experiment
Overall the same parameters as in #8 (comment).
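One way to probe the hypothesis above, that the repetition count is written to the same address as the last input vector, is to compare the write attention at the last content step with the write attention at the repetition-count step. The sketch below assumes the write weights can be extracted from the model as a `(num_steps, memory_size)` array; how to obtain them is implementation-specific and not shown here.

```python
import numpy as np

def write_address_overlap(write_weights, seq_length):
    """Compare where the last input vector and the repetition count are written.

    `write_weights` is assumed to be a (num_steps, memory_size) array of
    write attention, with the sequence steps followed by the
    repetition-count step.
    """
    last_vector_addr = int(np.argmax(write_weights[seq_length - 1]))
    repeat_count_addr = int(np.argmax(write_weights[seq_length]))
    # Overlap of the two soft write distributions (1.0 means identical focus).
    overlap = float(np.minimum(write_weights[seq_length - 1],
                               write_weights[seq_length]).sum())
    return last_vector_addr, repeat_count_addr, overlap
```

If the two addresses coincide (high overlap), that would be consistent with the last input vector being partially overwritten when the repetition count is large.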
Repeat Copy task
I will gather all the progress on the Repeat Copy task in this issue. I will likely update this issue regularly (hopefully), so you may want to unsubscribe from this issue if you don't want to get all the spam.