Repeat Copy task #8
Training the NTM on sequences of length one

Just like in #4, I started the experiments on the Repeat Copy task by training the NTM on sequences of length one with random repetitions between 1 and 5. The training went surprisingly well and converged in a few thousand iterations (see the learning curve below). Tests on sequences of length 1 show that the NTM is able to properly repeat the length-one inputs it was trained on. However, learning only on sequences of length one may bias the experiment and not actually teach the proper task. Instead, the NTM seems to have learned the following procedure:
Hence the model does not show any sign of generalization yet, and tests on longer sequences seem to confirm that the NTM writes the whole input sequence to the first memory address, overwriting previous steps.

Learning curve
Similar to #4.

Parameters of the experiment
Same parameters as in #6.
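For reference, here is a minimal sketch of how such length-one training examples could be generated. This is not the generator from this repository; the channel layout (content channels plus one extra channel carrying the repetition count on the step after the sequence) is an assumption made only for illustration.

```python
import numpy as np

def repeat_copy_example(length=1, max_repeats=5, num_bits=8, rng=np.random):
    """Hypothetical generator for one Repeat Copy training example."""
    repeats = rng.randint(1, max_repeats + 1)
    seq = rng.randint(0, 2, size=(length, num_bits)).astype(np.float32)

    # Input: the sequence itself, then one extra step carrying the
    # repetition count on a dedicated channel (layout is an assumption).
    inputs = np.zeros((length + 1, num_bits + 1), dtype=np.float32)
    inputs[:length, :num_bits] = seq
    inputs[length, num_bits] = repeats

    # Target: the sequence repeated `repeats` times.
    targets = np.tile(seq, (repeats, 1))
    return inputs, targets
```

With `length=1`, only a single content vector ever has to be stored, which is what makes the degenerate single-address strategy described above sufficient during training.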
Training on sequences of length 3-5

As the initial experiment in #8 (comment) suggested, training on very short sequences raises the problem of not learning the proper repeat task.
Therefore I trained the NTM on sequences of length 3 to 5 with a number of repetitions up to 5. After running a few tests, it appears that the NTM is indeed able to repeat the input sequence on its output. What is particularly outstanding is that it seems to show great generalization properties, both in the length of the input sequence and in the number of repetitions. Here are a few tests:
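A simple way to quantify such tests is to sweep the sequence length and the number of repetitions beyond the training range and measure the fraction of wrongly predicted bits. The sketch below is only an illustration: it assumes a `predict(inputs)` callable that returns the NTM's binarized output, which is not part of this repository's API, and it reuses the hypothetical input layout from the generator above.

```python
import numpy as np

def bit_error_rate(predict, lengths=(5, 10, 20), repeats=(1, 5, 10),
                   num_trials=20, num_bits=8, rng=np.random):
    """Average fraction of wrong bits for each (length, repeats) setting."""
    results = {}
    for length in lengths:
        for r in repeats:
            errors = []
            for _ in range(num_trials):
                seq = rng.randint(0, 2, size=(length, num_bits)).astype(np.float32)
                inputs = np.zeros((length + 1, num_bits + 1), dtype=np.float32)
                inputs[:length, :num_bits] = seq
                inputs[length, num_bits] = r
                target = np.tile(seq, (r, 1))
                output = predict(inputs)  # assumed to return {0, 1} values
                errors.append(np.mean(output != target))
            results[(length, r)] = float(np.mean(errors))
    return results
```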
However, even if the generalization quality looks good, the NTM sometimes misses a few bits in its predictions (see below). Another surprising behavior is that the NTM writes the number of repetitions to the same address as the last vector of the input sequence, which may explain this: when the number of repetitions is too large, it may overwrite all or part of the last input vector in memory. If we look at the

Learning curve

Parameters of the experiment
Overall the same parameters as in #8 (comment).
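One way to probe the hypothesis above, that the repetition count is written to the same address as the last input vector, is to compare the write attention at the last content step with the write attention at the repetition-count step. The sketch below assumes the write weights can be extracted from the model as a `(num_steps, memory_size)` array; how to obtain them is implementation-specific and not shown here.

```python
import numpy as np

def write_address_overlap(write_weights, seq_length):
    """Compare where the last input vector and the repetition count are written.

    `write_weights` is assumed to be a (num_steps, memory_size) array of
    write attention, with the sequence steps followed by the
    repetition-count step.
    """
    last_vector_addr = int(np.argmax(write_weights[seq_length - 1]))
    repeat_count_addr = int(np.argmax(write_weights[seq_length]))
    # Overlap of the two soft write distributions (1.0 means identical focus).
    overlap = float(np.minimum(write_weights[seq_length - 1],
                               write_weights[seq_length]).sum())
    return last_vector_addr, repeat_count_addr, overlap
```

If the two addresses coincide (high overlap), that would be consistent with the last input vector being partially overwritten when the repetition count is large.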
Repeat Copy task
I will gather all the progress on the Repeat Copy task in this issue. I will likely update this issue regularly (hopefully), so you may want to unsubscribe from this issue if you don't want to get all the spam.