ValueError: could not broadcast input array from shape (99) into shape (100) #148

Open
jaqsro opened this issue Apr 30, 2019 · 0 comments
I'm trying to train a model for Portuguese. To do this, I followed the steps described in #30.
I downloaded the word vectors from here: GLOVE, 100 dimensions.
I updated the parameters.ini file accordingly and ran python __main__.py, but I am getting the following error:

python __main__.py 
{'train_model': 1, 'use_pretrained_model': 0, 'pretrained_model_folder': './trained_models/jaq_first_model', 'dataset_text_folder': './data/pt/vagas', 'main_evaluation_mode': 'conll', 'output_folder': './output', 'use_character_lstm': 1, 'character_embedding_dimension': 25, 'character_lstm_hidden_state_dimension': 25, 'token_pretrained_embedding_filepath': './data/word_vectors/cbow_s100.txt', 'token_embedding_dimension': 100, 'token_lstm_hidden_state_dimension': 1, 'use_crf': 1, 'patience': 10, 'maximum_number_of_epochs': 100, 'optimizer': 'sgd', 'learning_rate': 0.005, 'gradient_clipping_value': 5.0, 'dropout_rate': 0.5, 'number_of_cpu_threads': 8, 'number_of_gpus': 0, 'experiment_name': 'test', 'output_scores': 0, 'tagging_format': 'bioes', 'tokenizer': 'spacy', 'spacylanguage': 'pt', 'remap_unknown_tokens_to_unk': 1, 'load_only_pretrained_token_embeddings': 0, 'load_all_pretrained_token_embeddings': 'False', 'check_for_lowercase': 1, 'check_for_digits_replaced_with_zeros': 1, 'freeze_token_embeddings': 0, 'debug': 0, 'verbose': 0, 'plot_format': 'pdf', 'reload_character_embeddings': 1, 'reload_character_lstm': 1, 'reload_token_embeddings': 1, 'reload_token_lstm': 1, 'reload_feedforward': 1, 'reload_crf': 1, 'parameters_filepath': './parameters.ini', 'fetch_data': '', 'fetch_trained_model': ''}
Checking compatibility between CONLL and BRAT for train_spacy set ... Done.
Checking validity of CONLL BIOES format... Done.
Checking compatibility between CONLL and BRAT for valid_spacy set ... Done.
Checking validity of CONLL BIOES format... Done.
Checking compatibility between CONLL and BRAT for test_spacy set ... Done.
Checking validity of CONLL BIOES format... Done.
Load dataset... done (84.74 seconds)
WARNING:tensorflow:From /home/deploy-user/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py:263: colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.
Instructions for updating:
Colocations handled automatically by placer.
WARNING:tensorflow:From /home/deploy-user/anaconda3/lib/python3.6/site-packages/neuroner/entity_lstm.py:46: bidirectional_dynamic_rnn (from tensorflow.python.ops.rnn) is deprecated and will be removed in a future version.
Instructions for updating:
Please use `keras.layers.Bidirectional(keras.layers.RNN(cell))`, which is equivalent to this API
WARNING:tensorflow:From /home/deploy-user/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/rnn.py:443: dynamic_rnn (from tensorflow.python.ops.rnn) is deprecated and will be removed in a future version.
Instructions for updating:
Please use `keras.layers.RNN(cell)`, which is equivalent to this API
WARNING:tensorflow:From /home/deploy-user/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/rnn.py:626: to_int32 (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.cast instead.
WARNING:tensorflow:From /home/deploy-user/anaconda3/lib/python3.6/site-packages/neuroner/entity_lstm.py:146: calling dropout (from tensorflow.python.ops.nn_ops) with keep_prob is deprecated and will be removed in a future version.
Instructions for updating:
Please use `rate` instead of `keep_prob`. Rate should be set to `rate = 1 - keep_prob`.
Load token embeddings... Traceback (most recent call last):
  File "__main__.py", line 114, in <module>
    main()
  File "__main__.py", line 109, in main
    nn = neuromodel.NeuroNER(**arguments)
  File "/home/deploy-user/anaconda3/lib/python3.6/site-packages/neuroner/neuromodel.py", line 485, in __init__
    self.parameters, token_to_vector)
  File "/home/deploy-user/anaconda3/lib/python3.6/site-packages/neuroner/entity_lstm.py", line 334, in load_pretrained_token_embeddings
    initial_weights[dataset.token_to_index[token]] = token_to_vector[token]
ValueError: could not broadcast input array from shape (99) into shape (100)
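
The error says that at least one token in the embedding file was parsed into a 99-value vector while token_embedding_dimension is 100. One way to narrow this down might be to scan the file for lines whose vector length differs from the expected dimension. The following is only a diagnostic sketch: the function names are hypothetical, and splitting each line on a single space is an assumption about how the loader parses the file, not a confirmed description of NeuroNER's code.

```python
def find_dimension_mismatches(lines, expected_dim=100, max_reports=10):
    """Return (line_number, token, found_dim) for lines whose vector
    length differs from expected_dim.

    Assumption: each line is "token v1 v2 ... vN" separated by single
    spaces. A header line such as "<vocab_size> <dim>" would also be
    reported, since it does not carry expected_dim values.
    """
    mismatches = []
    for line_number, line in enumerate(lines, start=1):
        fields = line.strip().split(" ")
        if len(fields) < 2:
            # Blank line or a token with no vector: nothing to compare.
            continue
        token, vector = fields[0], fields[1:]
        if len(vector) != expected_dim:
            mismatches.append((line_number, token, len(vector)))
            if len(mismatches) >= max_reports:
                break
    return mismatches


def check_embedding_file(path, expected_dim=100):
    """Hypothetical helper: run the check on a text embedding file."""
    with open(path, encoding="utf-8") as handle:
        return find_dimension_mismatches(handle, expected_dim)
```

With the settings from the parameters above, this could be called as check_embedding_file('./data/word_vectors/cbow_s100.txt', 100). Any line reported with a found dimension of 99 would be a candidate for the entry that triggers the ValueError.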

I did the same with the Docker version of NeuroNER and it worked, but the NeuroNER version in the Docker image is a little older, and I wanted to make this work with the newest available version.

Thanks
