DQN-Social Attention in Highway-env #79
Hi @SimoMaestri I think that although we can observe and describe what these models do, it is often quite difficult to explain why they do it. I guess the bottom line is that the attention-based representation network has no incentive to do anything different (than having two heads looking at the same vehicles behind) as long as it works (i.e. induces high rewards), so the question is rather: why does this strategy work? Does the 2nd policy (3k episodes) indeed achieve similar performance to the 1st one (1k episodes)?
Actually, it looks like the 2nd policy does look at front vehicles, but only when they are very close. So the information of a possibly imminent collision is present in the output, even if it's not weighted 100%.

Also, after overtaking a vehicle, we can see that the model is still attending to it for a little while. However, that rear attention drops whenever the vehicle gets a bit further away and the risk of collision from a lane change disappears. This happens three times, at 0:03, 0:06, and 0:09, so it looks like consistent behaviour.

So the last mystery is why the attention focuses on the single vehicle in the back most of the time. I would guess it is a kind of default behaviour that emerged and occurs whenever there is no imminent danger and no decision to take. Then it does not really matter where you look anyway, as long as you can switch your attention back when anything comes up. In my own experiments, the default vehicle that is observed when nothing in particular is happening is often the ego-vehicle itself, e.g. here: https://raw.githubusercontent.com/eleurent/social-attention/master/assets/straight.mp4 But it is not always the case, and you can see e.g. here: https://raw.githubusercontent.com/eleurent/social-attention/master/assets/2_distance.mp4 that after the important decisions have been taken and the vehicle can safely proceed (at 0:06), both attention heads are also looking at a vehicle way behind, which seems irrelevant, just like in your 3k training example.

So although it is hard to justify exactly, my best guess would be that in these nominal situations, the (hypothetical) collision-detection neurons which normally drive the attention scores in dangerous states are inhibited, and so the attention just focuses on anything, because it has to by the softmax formulation. It could also be uniform, which would probably feel better to us, but it doesn't have to be, as long as it works.
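To make the softmax point concrete, here is a minimal sketch (not code from the repository; the per-vehicle scores are made-up values) showing that each attention head must always distribute a full unit of weight over the vehicles, so even when no score dominates, the weight still has to land somewhere, uniform or not:

```python
import numpy as np

def softmax(scores):
    # Numerically stable softmax: outputs are positive and sum to 1.
    e = np.exp(scores - scores.max())
    return e / e.sum()

# Hypothetical per-vehicle attention scores for one head.
danger  = np.array([4.0, 0.1, 0.0, 0.2])   # a nearby vehicle dominates the scores
nominal = np.array([0.1, 0.0, 0.05, 0.3])  # "collision-detection neurons" inhibited

print(softmax(danger))   # sharply focused on vehicle 0
print(softmax(nominal))  # near-uniform, but still sums to 1:
                         # the head has to look at *something*
```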
Hi, thank you so much for this exhaustive and very clear explanation.
I met the same problem when I trained the attention model in highway-env, but I think it's just a bug. The Kinematics observation uses MAX_SPEED to normalize the observation. That is fine for the other NPC vehicles, because they use coordinates relative to the ego-vehicle, but the ego-vehicle itself uses absolute coordinates, so it breaks once the driving distance exceeds the upper boundary of the normalization range: the ego-vehicle's normalized position is then always clipped to 1.
In this case, when the attention is computed, the code regards the ego-vehicle as just another vehicle sitting behind all the others. That is why it always keeps attending to the vehicle behind it, even when that vehicle is not in the input. I just added an "if" here, and it works well: deep_q_network/graphics.py
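To illustrate the failure mode, here is a minimal sketch (not the actual highway-env code; the feature range and positions are made-up values standing in for the range derived from MAX_SPEED):

```python
import numpy as np

# Hypothetical normalization range for the "x" feature, standing in for the
# one highway-env derives from MAX_SPEED.
X_RANGE = (-100.0, 100.0)

def normalize(x, low, high):
    # Map [low, high] to [-1, 1] and clip, Kinematics-observation style.
    return float(np.clip(2 * (x - low) / (high - low) - 1, -1, 1))

npc_rel_x = 12.0    # NPC position relative to the ego-vehicle: stays bounded
ego_abs_x = 850.0   # ego absolute position: grows without bound over the episode

print(normalize(npc_rel_x, *X_RANGE))  # 0.12 -> informative
print(normalize(ego_abs_x, *X_RANGE))  # 1.0  -> saturated: the ego-vehicle's
                                       # normalized position no longer reflects
                                       # where it actually is
```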
Oooooh, that's a great catch @Shuchengzhang888! thanks :)
Hi,
I've trained a DQN model with social attention in HighwayEnv. I've used ego-attention with 2 heads, but I don't understand the results.
In the first video you can see the output when the model is trained for 1000 episodes. In the second video the model is trained for 3000 episodes.
I've noticed that, in the first case, one head attends to all the vehicles that are on the left or in front of the ego-vehicle, while the other head attends to vehicles in the right lane.
In the second case, it's all different. Both heads give more attention to vehicles that are behind the ego-vehicle, and I don't understand why this happens. It's also strange that both heads attend to the same vehicle. Can you give some explanation of this?
1000episodes:
download.mp4
3000 episodes:
https://user-images.githubusercontent.com/32385644/164644814-43cdf4de-64c6-4b22-978b-116583bd56ad.mp4