In HAN's attention I saw the following:

```python
attention_logits = tf.reduce_sum(hidden_state_context_similarity, axis=2)
attention_logits_max = tf.reduce_max(attention_logits, axis=1, keep_dims=True)
p_attention = tf.nn.softmax(attention_logits - attention_logits_max)
```

I didn't see this operation in the original paper. Why is it done here?
This is because softmax is shift-invariant: adding (or subtracting) the same constant to every coordinate of the input leaves the output unchanged. See Wikipedia and a StackOverflow answer.
Subtracting the maximum value from attention_logits therefore does not change the result, but it makes the computation numerically stable, since the exponentials inside the softmax can no longer overflow.
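A minimal NumPy sketch (illustrative only, not code from this repository) showing both points: the shift cancels out of the result, and subtracting the max prevents overflow for large logits:

```python
import numpy as np

def softmax(x):
    # naive softmax: exp(x) can overflow for large inputs
    e = np.exp(x)
    return e / e.sum()

def softmax_stable(x):
    # shift-invariant trick: subtract the max before exponentiating
    e = np.exp(x - np.max(x))
    return e / e.sum()

small = np.array([1.0, 2.0, 3.0])
print(softmax(small))         # [0.09003057 0.24472847 0.66524096]
print(softmax_stable(small))  # identical output: the constant shift cancels

large = np.array([1000.0, 1001.0, 1002.0])
print(softmax(large))         # [nan nan nan] -- exp(1000) overflows
print(softmax_stable(large))  # [0.09003057 0.24472847 0.66524096]
```

The same reasoning applies to the TensorFlow snippet above: `tf.reduce_max(..., keep_dims=True)` computes the per-row maximum so it can be broadcast and subtracted from every logit before `tf.nn.softmax`.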