ValueError: could not broadcast input array from shape (99) into shape (100) #148

jaqsro · 2019-04-30T18:26:22Z

I'm trying to train a model for Portuguese. To do this, I followed the steps described in #30.
I downloaded the word vectors from here: GloVe, 100 dimensions.
I updated the parameters.ini file accordingly and ran `python __main__.py`, but I am getting the following error:

python __main__.py 
{'train_model': 1, 'use_pretrained_model': 0, 'pretrained_model_folder': './trained_models/jaq_first_model', 'dataset_text_folder': './data/pt/vagas', 'main_evaluation_mode': 'conll', 'output_folder': './output', 'use_character_lstm': 1, 'character_embedding_dimension': 25, 'character_lstm_hidden_state_dimension': 25, 'token_pretrained_embedding_filepath': './data/word_vectors/cbow_s100.txt', 'token_embedding_dimension': 100, 'token_lstm_hidden_state_dimension': 1, 'use_crf': 1, 'patience': 10, 'maximum_number_of_epochs': 100, 'optimizer': 'sgd', 'learning_rate': 0.005, 'gradient_clipping_value': 5.0, 'dropout_rate': 0.5, 'number_of_cpu_threads': 8, 'number_of_gpus': 0, 'experiment_name': 'test', 'output_scores': 0, 'tagging_format': 'bioes', 'tokenizer': 'spacy', 'spacylanguage': 'pt', 'remap_unknown_tokens_to_unk': 1, 'load_only_pretrained_token_embeddings': 0, 'load_all_pretrained_token_embeddings': 'False', 'check_for_lowercase': 1, 'check_for_digits_replaced_with_zeros': 1, 'freeze_token_embeddings': 0, 'debug': 0, 'verbose': 0, 'plot_format': 'pdf', 'reload_character_embeddings': 1, 'reload_character_lstm': 1, 'reload_token_embeddings': 1, 'reload_token_lstm': 1, 'reload_feedforward': 1, 'reload_crf': 1, 'parameters_filepath': './parameters.ini', 'fetch_data': '', 'fetch_trained_model': ''}
Checking compatibility between CONLL and BRAT for train_spacy set ... Done.
Checking validity of CONLL BIOES format... Done.
Checking compatibility between CONLL and BRAT for valid_spacy set ... Done.
Checking validity of CONLL BIOES format... Done.
Checking compatibility between CONLL and BRAT for test_spacy set ... Done.
Checking validity of CONLL BIOES format... Done.
Load dataset... done (84.74 seconds)
WARNING:tensorflow:From /home/deploy-user/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py:263: colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.
Instructions for updating:
Colocations handled automatically by placer.
WARNING:tensorflow:From /home/deploy-user/anaconda3/lib/python3.6/site-packages/neuroner/entity_lstm.py:46: bidirectional_dynamic_rnn (from tensorflow.python.ops.rnn) is deprecated and will be removed in a future version.
Instructions for updating:
Please use `keras.layers.Bidirectional(keras.layers.RNN(cell))`, which is equivalent to this API
WARNING:tensorflow:From /home/deploy-user/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/rnn.py:443: dynamic_rnn (from tensorflow.python.ops.rnn) is deprecated and will be removed in a future version.
Instructions for updating:
Please use `keras.layers.RNN(cell)`, which is equivalent to this API
WARNING:tensorflow:From /home/deploy-user/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/rnn.py:626: to_int32 (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.cast instead.
WARNING:tensorflow:From /home/deploy-user/anaconda3/lib/python3.6/site-packages/neuroner/entity_lstm.py:146: calling dropout (from tensorflow.python.ops.nn_ops) with keep_prob is deprecated and will be removed in a future version.
Instructions for updating:
Please use `rate` instead of `keep_prob`. Rate should be set to `rate = 1 - keep_prob`.
**Load token embeddings... Traceback (most recent call last):
  File "__main__.py", line 114, in <module>
    main()
  File "__main__.py", line 109, in main
    nn = neuromodel.NeuroNER(**arguments)
  File "/home/deploy-user/anaconda3/lib/python3.6/site-packages/neuroner/neuromodel.py", line 485, in __init__
    self.parameters, token_to_vector)
  File "/home/deploy-user/anaconda3/lib/python3.6/site-packages/neuroner/entity_lstm.py", line 334, in load_pretrained_token_embeddings
    initial_weights[dataset.token_to_index[token]] = token_to_vector[token]
ValueError: could not broadcast input array from shape (99) into shape (100)**

The same setup works with the Docker version of NeuroNER, but the NeuroNER version in that Docker image is a little older, and I would like to get this working with the newest available version.
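
One way to narrow this down is to check whether every line of the embeddings file really carries 100 values. The sketch below is only a diagnostic, not NeuroNER code: the function name `find_bad_embedding_lines` is hypothetical, and it assumes the loader treats each line as a token followed by space-separated numbers after stripping surrounding whitespace. Under that assumption, a line whose token is itself whitespace (for example a non-breaking space) would lose its token when stripped and yield 99 values instead of 100.

```python
def find_bad_embedding_lines(path, expected_dim, max_report=10):
    """Return (line_number, n_values) pairs for lines whose vector length
    differs from expected_dim.

    Assumption: each line is "<token> <v1> <v2> ... <vN>" separated by
    single spaces, and the first field is taken as the token.
    """
    bad = []
    with open(path, encoding="utf-8") as f:
        for line_number, line in enumerate(f, start=1):
            # Mirror a typical text-embedding loader: strip, then split on spaces.
            fields = line.strip().split(" ")
            n_values = len(fields) - 1  # everything after the first field
            if n_values != expected_dim:
                bad.append((line_number, n_values))
                if len(bad) >= max_report:
                    break  # stop early; a few offenders are enough to inspect
    return bad
```

For the configuration shown above, this could be called as `find_bad_embedding_lines('./data/word_vectors/cbow_s100.txt', 100)`. If the file begins with a word2vec-style header line (vocabulary size and dimension), that line would also be reported and can be ignored.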

Thanks
