You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Hi, I'm trying to implement the consensus opinion (levels_mean) as a geometric mean of the top-down predictions, bottom-up predictions, attention-weighted average of same-level embeddings, and embeddings of the previous time step as described by the original paper. Any ideas on how the weights should be set?
At first I thought this could be a learnable parameter, but section 9.1 reads
For interpreting a static image with no temporal context, the weights used for this weighted geometric mean need to change during the iterations that occur after a new fixation.
which leads me to believe that these might need to be outputted on the fly a la vanilla attention as opposed to being learned. Maybe an MLP that takes in the four source embeddings and outputs four scalars as weights?
The text was updated successfully, but these errors were encountered:
Hi, I'm trying to implement the consensus opinion (
levels_mean
) as a geometric mean of the top-down predictions, bottom-up predictions, attention-weighted average of same-level embeddings, and embeddings of the previous time step as described by the original paper. Any ideas on how the weights should be set?At first I thought this could be a learnable parameter, but section 9.1 reads
which leads me to believe that these might need to be outputted on the fly a la vanilla attention as opposed to being learned. Maybe an MLP that takes in the four source embeddings and outputs four scalars as weights?
The text was updated successfully, but these errors were encountered: