
Timeseries example not reproducible #2018

Open · twoody2007 opened this issue Dec 31, 2024 · 0 comments
Issue Type: Documentation Bug
Source: binary
Keras Version: 3.7.0
Custom Code: No
OS Platform and Distribution: Ubuntu 22.04
Python version: 3.12.8
GPU model and memory: RTX 5000 Ada

Current Behavior?

Running the code from this time series example does not produce the same number of parameters as the example output in the documentation. Further, the model does not achieve the stated accuracy.

The Colab link has the same issue.

What is strange is that the doc's final Dense layer has ~64K params, while running the code gives a final hidden Dense layer with only 2x the MLP unit count in params (256 for the default mlp_units of 128). I tried increasing the unit count to see whether that fixed the problem, but something seems structurally different between how this code runs on Keras 2.4 vs. 3.7.0.

I expected close to the same output as what is documented on the page.
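For reference on where the structural difference might sit: the doc's ~64K-param Dense layer is consistent with the pooling step emitting shape (None, 500) (500 * 128 + 128 = 64,128 params), whereas the summary below shows GlobalAveragePooling1D emitting (None, 1). Here is a minimal probe, assuming the pooling call in the linked script is the culprit (the data_format comparison is my own sketch, not a confirmed diagnosis):

```python
import numpy as np
from keras import layers

# Probe: how does GlobalAveragePooling1D reduce a (batch, 500, 1) input
# under each data_format? A (2, 500) result matches the doc's ~64K Dense
# layer (500 * 128 + 128 = 64,128 params); a (2, 1) result matches the
# summary below.
x = np.zeros((2, 500, 1), dtype="float32")
for fmt in ("channels_last", "channels_first"):
    pool = layers.GlobalAveragePooling1D(data_format=fmt)
    print(fmt, "->", tuple(pool(x).shape))
```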

Standalone code to reproduce the issue or tutorial link

You can run the colab example:
* https://colab.research.google.com/github/keras-team/keras-io/blob/master/examples/timeseries/ipynb/timeseries_classification_transformer.ipynb

or run the code located here:
* https://github.com/keras-team/keras-io/blob/master/examples/timeseries/timeseries_classification_transformer.py
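As a lighter-weight check that skips the dataset download, one can rebuild just the attention sublayer with the script's hyperparameters and compare parameter counts against the summary below (a sketch, assuming the linked script's head_size=256, num_heads=4, dropout=0.25):

```python
import keras
from keras import layers

# One attention sublayer with the tutorial's hyperparameters.
# Expected 7,169 params, matching each MultiHeadAttention row below:
# 3 projections x (1*4*256 kernel + 4*256 bias) + (4*256*1 kernel + 1 bias)
# = 3 * 2048 + 1025 = 7,169.
inputs = keras.Input(shape=(500, 1))
outputs = layers.MultiHeadAttention(key_dim=256, num_heads=4, dropout=0.25)(
    inputs, inputs
)
keras.Model(inputs, outputs).summary()
```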

Relevant log output

(tcap) travis@travis-p1-g6:~/projects/tetra_capital$ python scripts/example_classifications.py 
Model: "functional"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Layer (type)                  ┃ Output Shape              ┃         Param # ┃ Connected to               ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ input_layer (InputLayer)      │ (None, 500, 1)            │               0 │ -                          │
├───────────────────────────────┼───────────────────────────┼─────────────────┼────────────────────────────┤
│ multi_head_attention          │ (None, 500, 1)            │           7,169 │ input_layer[0][0],         │
│ (MultiHeadAttention)          │                           │                 │ input_layer[0][0]          │
├───────────────────────────────┼───────────────────────────┼─────────────────┼────────────────────────────┤
│ dropout_1 (Dropout)           │ (None, 500, 1)            │               0 │ multi_head_attention[0][0] │
├───────────────────────────────┼───────────────────────────┼─────────────────┼────────────────────────────┤
│ layer_normalization           │ (None, 500, 1)            │               2 │ dropout_1[0][0]            │
│ (LayerNormalization)          │                           │                 │                            │
├───────────────────────────────┼───────────────────────────┼─────────────────┼────────────────────────────┤
│ add (Add)                     │ (None, 500, 1)            │               0 │ layer_normalization[0][0], │
│                               │                           │                 │ input_layer[0][0]          │
├───────────────────────────────┼───────────────────────────┼─────────────────┼────────────────────────────┤
│ conv1d (Conv1D)               │ (None, 500, 4)            │               8 │ add[0][0]                  │
├───────────────────────────────┼───────────────────────────┼─────────────────┼────────────────────────────┤
│ dropout_2 (Dropout)           │ (None, 500, 4)            │               0 │ conv1d[0][0]               │
├───────────────────────────────┼───────────────────────────┼─────────────────┼────────────────────────────┤
│ conv1d_1 (Conv1D)             │ (None, 500, 1)            │               5 │ dropout_2[0][0]            │
├───────────────────────────────┼───────────────────────────┼─────────────────┼────────────────────────────┤
│ layer_normalization_1         │ (None, 500, 1)            │               2 │ conv1d_1[0][0]             │
│ (LayerNormalization)          │                           │                 │                            │
├───────────────────────────────┼───────────────────────────┼─────────────────┼────────────────────────────┤
│ add_1 (Add)                   │ (None, 500, 1)            │               0 │ layer_normalization_1[0][… │
│                               │                           │                 │ add[0][0]                  │
├───────────────────────────────┼───────────────────────────┼─────────────────┼────────────────────────────┤
│ multi_head_attention_1        │ (None, 500, 1)            │           7,169 │ add_1[0][0], add_1[0][0]   │
│ (MultiHeadAttention)          │                           │                 │                            │
├───────────────────────────────┼───────────────────────────┼─────────────────┼────────────────────────────┤
│ dropout_4 (Dropout)           │ (None, 500, 1)            │               0 │ multi_head_attention_1[0]… │
├───────────────────────────────┼───────────────────────────┼─────────────────┼────────────────────────────┤
│ layer_normalization_2         │ (None, 500, 1)            │               2 │ dropout_4[0][0]            │
│ (LayerNormalization)          │                           │                 │                            │
├───────────────────────────────┼───────────────────────────┼─────────────────┼────────────────────────────┤
│ add_2 (Add)                   │ (None, 500, 1)            │               0 │ layer_normalization_2[0][… │
│                               │                           │                 │ add_1[0][0]                │
├───────────────────────────────┼───────────────────────────┼─────────────────┼────────────────────────────┤
│ conv1d_2 (Conv1D)             │ (None, 500, 4)            │               8 │ add_2[0][0]                │
├───────────────────────────────┼───────────────────────────┼─────────────────┼────────────────────────────┤
│ dropout_5 (Dropout)           │ (None, 500, 4)            │               0 │ conv1d_2[0][0]             │
├───────────────────────────────┼───────────────────────────┼─────────────────┼────────────────────────────┤
│ conv1d_3 (Conv1D)             │ (None, 500, 1)            │               5 │ dropout_5[0][0]            │
├───────────────────────────────┼───────────────────────────┼─────────────────┼────────────────────────────┤
│ layer_normalization_3         │ (None, 500, 1)            │               2 │ conv1d_3[0][0]             │
│ (LayerNormalization)          │                           │                 │                            │
├───────────────────────────────┼───────────────────────────┼─────────────────┼────────────────────────────┤
│ add_3 (Add)                   │ (None, 500, 1)            │               0 │ layer_normalization_3[0][… │
│                               │                           │                 │ add_2[0][0]                │
├───────────────────────────────┼───────────────────────────┼─────────────────┼────────────────────────────┤
│ multi_head_attention_2        │ (None, 500, 1)            │           7,169 │ add_3[0][0], add_3[0][0]   │
│ (MultiHeadAttention)          │                           │                 │                            │
├───────────────────────────────┼───────────────────────────┼─────────────────┼────────────────────────────┤
│ dropout_7 (Dropout)           │ (None, 500, 1)            │               0 │ multi_head_attention_2[0]… │
├───────────────────────────────┼───────────────────────────┼─────────────────┼────────────────────────────┤
│ layer_normalization_4         │ (None, 500, 1)            │               2 │ dropout_7[0][0]            │
│ (LayerNormalization)          │                           │                 │                            │
├───────────────────────────────┼───────────────────────────┼─────────────────┼────────────────────────────┤
│ add_4 (Add)                   │ (None, 500, 1)            │               0 │ layer_normalization_4[0][… │
│                               │                           │                 │ add_3[0][0]                │
├───────────────────────────────┼───────────────────────────┼─────────────────┼────────────────────────────┤
│ conv1d_4 (Conv1D)             │ (None, 500, 4)            │               8 │ add_4[0][0]                │
├───────────────────────────────┼───────────────────────────┼─────────────────┼────────────────────────────┤
│ dropout_8 (Dropout)           │ (None, 500, 4)            │               0 │ conv1d_4[0][0]             │
├───────────────────────────────┼───────────────────────────┼─────────────────┼────────────────────────────┤
│ conv1d_5 (Conv1D)             │ (None, 500, 1)            │               5 │ dropout_8[0][0]            │
├───────────────────────────────┼───────────────────────────┼─────────────────┼────────────────────────────┤
│ layer_normalization_5         │ (None, 500, 1)            │               2 │ conv1d_5[0][0]             │
│ (LayerNormalization)          │                           │                 │                            │
├───────────────────────────────┼───────────────────────────┼─────────────────┼────────────────────────────┤
│ add_5 (Add)                   │ (None, 500, 1)            │               0 │ layer_normalization_5[0][… │
│                               │                           │                 │ add_4[0][0]                │
├───────────────────────────────┼───────────────────────────┼─────────────────┼────────────────────────────┤
│ multi_head_attention_3        │ (None, 500, 1)            │           7,169 │ add_5[0][0], add_5[0][0]   │
│ (MultiHeadAttention)          │                           │                 │                            │
├───────────────────────────────┼───────────────────────────┼─────────────────┼────────────────────────────┤
│ dropout_10 (Dropout)          │ (None, 500, 1)            │               0 │ multi_head_attention_3[0]… │
├───────────────────────────────┼───────────────────────────┼─────────────────┼────────────────────────────┤
│ layer_normalization_6         │ (None, 500, 1)            │               2 │ dropout_10[0][0]           │
│ (LayerNormalization)          │                           │                 │                            │
├───────────────────────────────┼───────────────────────────┼─────────────────┼────────────────────────────┤
│ add_6 (Add)                   │ (None, 500, 1)            │               0 │ layer_normalization_6[0][… │
│                               │                           │                 │ add_5[0][0]                │
├───────────────────────────────┼───────────────────────────┼─────────────────┼────────────────────────────┤
│ conv1d_6 (Conv1D)             │ (None, 500, 4)            │               8 │ add_6[0][0]                │
├───────────────────────────────┼───────────────────────────┼─────────────────┼────────────────────────────┤
│ dropout_11 (Dropout)          │ (None, 500, 4)            │               0 │ conv1d_6[0][0]             │
├───────────────────────────────┼───────────────────────────┼─────────────────┼────────────────────────────┤
│ conv1d_7 (Conv1D)             │ (None, 500, 1)            │               5 │ dropout_11[0][0]           │
├───────────────────────────────┼───────────────────────────┼─────────────────┼────────────────────────────┤
│ layer_normalization_7         │ (None, 500, 1)            │               2 │ conv1d_7[0][0]             │
│ (LayerNormalization)          │                           │                 │                            │
├───────────────────────────────┼───────────────────────────┼─────────────────┼────────────────────────────┤
│ add_7 (Add)                   │ (None, 500, 1)            │               0 │ layer_normalization_7[0][… │
│                               │                           │                 │ add_6[0][0]                │
├───────────────────────────────┼───────────────────────────┼─────────────────┼────────────────────────────┤
│ global_average_pooling1d      │ (None, 1)                 │               0 │ add_7[0][0]                │
│ (GlobalAveragePooling1D)      │                           │                 │                            │
├───────────────────────────────┼───────────────────────────┼─────────────────┼────────────────────────────┤
│ dense (Dense)                 │ (None, 2048)              │           4,096 │ global_average_pooling1d[… │
├───────────────────────────────┼───────────────────────────┼─────────────────┼────────────────────────────┤
│ dropout_12 (Dropout)          │ (None, 2048)              │               0 │ dense[0][0]                │
├───────────────────────────────┼───────────────────────────┼─────────────────┼────────────────────────────┤
│ dense_1 (Dense)               │ (None, 2)                 │           4,098 │ dropout_12[0][0]           │
└───────────────────────────────┴───────────────────────────┴─────────────────┴────────────────────────────┘
 Total params: 36,938 (144.29 KB)
 Trainable params: 36,938 (144.29 KB)
 Non-trainable params: 0 (0.00 B)
Epoch 1/150
45/45 ━━━━━━━━━━━━━━━━━━━━ 22s 284ms/step - loss: 0.6932 - sparse_categorical_accuracy: 0.5079 - val_loss: 0.6927 - val_sparse_categorical_accuracy: 0.5354
Epoch 2/150
45/45 ━━━━━━━━━━━━━━━━━━━━ 9s 85ms/step - loss: 0.6931 - sparse_categorical_accuracy: 0.5024 - val_loss: 0.6925 - val_sparse_categorical_accuracy: 0.5354
Epoch 3/150
45/45 ━━━━━━━━━━━━━━━━━━━━ 4s 84ms/step - loss: 0.6932 - sparse_categorical_accuracy: 0.5005 - val_loss: 0.6926 - val_sparse_categorical_accuracy: 0.5354
Epoch 4/150
45/45 ━━━━━━━━━━━━━━━━━━━━ 4s 85ms/step - loss: 0.6931 - sparse_categorical_accuracy: 0.5031 - val_loss: 0.6925 - val_sparse_categorical_accuracy: 0.5354
Epoch 5/150
45/45 ━━━━━━━━━━━━━━━━━━━━ 4s 85ms/step - loss: 0.6929 - sparse_categorical_accuracy: 0.5155 - val_loss: 0.6923 - val_sparse_categorical_accuracy: 0.5354
Epoch 6/150
45/45 ━━━━━━━━━━━━━━━━━━━━ 4s 85ms/step - loss: 0.6933 - sparse_categorical_accuracy: 0.5004 - val_loss: 0.6924 - val_sparse_categorical_accuracy: 0.5354
Epoch 7/150
45/45 ━━━━━━━━━━━━━━━━━━━━ 4s 85ms/step - loss: 0.6931 - sparse_categorical_accuracy: 0.5078 - val_loss: 0.6922 - val_sparse_categorical_accuracy: 0.5354
Epoch 8/150
45/45 ━━━━━━━━━━━━━━━━━━━━ 4s 85ms/step - loss: 0.6930 - sparse_categorical_accuracy: 0.5096 - val_loss: 0.6923 - val_sparse_categorical_accuracy: 0.5354
Epoch 9/150
45/45 ━━━━━━━━━━━━━━━━━━━━ 4s 85ms/step - loss: 0.6929 - sparse_categorical_accuracy: 0.5131 - val_loss: 0.6923 - val_sparse_categorical_accuracy: 0.5354
Epoch 10/150
45/45 ━━━━━━━━━━━━━━━━━━━━ 4s 85ms/step - loss: 0.6928 - sparse_categorical_accuracy: 0.5196 - val_loss: 0.6923 - val_sparse_categorical_accuracy: 0.5354
Epoch 11/150
45/45 ━━━━━━━━━━━━━━━━━━━━ 4s 85ms/step - loss: 0.6932 - sparse_categorical_accuracy: 0.5021 - val_loss: 0.6923 - val_sparse_categorical_accuracy: 0.5354
Epoch 12/150
45/45 ━━━━━━━━━━━━━━━━━━━━ 4s 85ms/step - loss: 0.6934 - sparse_categorical_accuracy: 0.4936 - val_loss: 0.6924 - val_sparse_categorical_accuracy: 0.5354
Epoch 13/150
45/45 ━━━━━━━━━━━━━━━━━━━━ 4s 85ms/step - loss: 0.6928 - sparse_categorical_accuracy: 0.5176 - val_loss: 0.6922 - val_sparse_categorical_accuracy: 0.5354
Epoch 14/150
45/45 ━━━━━━━━━━━━━━━━━━━━ 4s 85ms/step - loss: 0.6933 - sparse_categorical_accuracy: 0.4975 - val_loss: 0.6923 - val_sparse_categorical_accuracy: 0.5354
Epoch 15/150
45/45 ━━━━━━━━━━━━━━━━━━━━ 4s 85ms/step - loss: 0.6930 - sparse_categorical_accuracy: 0.5098 - val_loss: 0.6922 - val_sparse_categorical_accuracy: 0.5354
Epoch 16/150
45/45 ━━━━━━━━━━━━━━━━━━━━ 4s 85ms/step - loss: 0.6930 - sparse_categorical_accuracy: 0.5078 - val_loss: 0.6922 - val_sparse_categorical_accuracy: 0.5354
Epoch 17/150
45/45 ━━━━━━━━━━━━━━━━━━━━ 4s 85ms/step - loss: 0.6927 - sparse_categorical_accuracy: 0.5171 - val_loss: 0.6922 - val_sparse_categorical_accuracy: 0.5354
Epoch 18/150
45/45 ━━━━━━━━━━━━━━━━━━━━ 4s 85ms/step - loss: 0.6929 - sparse_categorical_accuracy: 0.5118 - val_loss: 0.6922 - val_sparse_categorical_accuracy: 0.5354
Epoch 19/150
45/45 ━━━━━━━━━━━━━━━━━━━━ 4s 85ms/step - loss: 0.6930 - sparse_categorical_accuracy: 0.5079 - val_loss: 0.6922 - val_sparse_categorical_accuracy: 0.5354
Epoch 20/150
45/45 ━━━━━━━━━━━━━━━━━━━━ 4s 84ms/step - loss: 0.6932 - sparse_categorical_accuracy: 0.5029 - val_loss: 0.6923 - val_sparse_categorical_accuracy: 0.5354
Epoch 21/150
45/45 ━━━━━━━━━━━━━━━━━━━━ 4s 85ms/step - loss: 0.6930 - sparse_categorical_accuracy: 0.5075 - val_loss: 0.6923 - val_sparse_categorical_accuracy: 0.5354
Epoch 22/150
45/45 ━━━━━━━━━━━━━━━━━━━━ 4s 85ms/step - loss: 0.6929 - sparse_categorical_accuracy: 0.5145 - val_loss: 0.6922 - val_sparse_categorical_accuracy: 0.5354
Epoch 23/150
45/45 ━━━━━━━━━━━━━━━━━━━━ 4s 85ms/step - loss: 0.6930 - sparse_categorical_accuracy: 0.5101 - val_loss: 0.6922 - val_sparse_categorical_accuracy: 0.5354
Epoch 24/150
45/45 ━━━━━━━━━━━━━━━━━━━━ 4s 85ms/step - loss: 0.6930 - sparse_categorical_accuracy: 0.5090 - val_loss: 0.6923 - val_sparse_categorical_accuracy: 0.5354
Epoch 25/150
45/45 ━━━━━━━━━━━━━━━━━━━━ 4s 85ms/step - loss: 0.6930 - sparse_categorical_accuracy: 0.5079 - val_loss: 0.6922 - val_sparse_categorical_accuracy: 0.5354
42/42 ━━━━━━━━━━━━━━━━━━━━ 5s 58ms/step - loss: 0.6925 - sparse_categorical_accuracy: 0.5264 0
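(For context on the accuracy numbers: the training loss is pinned at roughly 0.693, which is ln 2, throughout, i.e., the two-class model never improves on chance; that is what one would expect if the pooled (None, 1) feature carries almost no information.)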