enet_peer_timeout() appears to have incorrect behavior for timeoutLimit parameter #159

cgutman · 2021-05-12T05:29:26Z

The documentation isn't crystal clear on this, but it seems like the timeoutLimit parameter for the enet_peer_timeout() function does not work as intended.

The documentation regarding the function says the following:

Timeout values use an exponential backoff mechanism, where if a reliable packet is not acknowledge within some multiple of the average RTT plus a variance tolerance, the timeout will be doubled until it reaches a set limit.

If the timeout is thus at this limit and reliable packets have been sent but not acknowledged within a certain minimum time period, the peer will be disconnected.

I have found ENet to behave correctly in accordance with sentence 2 based on the logic here:

enet/protocol.c

Lines 1355 to 1363 in cf735e6

    
           if (peer -> earliestTimeout != 0 && 
        
                 (ENET_TIME_DIFFERENCE (host -> serviceTime, peer -> earliestTimeout) >= peer -> timeoutMaximum || 
        
                   (outgoingCommand -> roundTripTimeout >= outgoingCommand -> roundTripTimeoutLimit && 
        
                     ENET_TIME_DIFFERENCE (host -> serviceTime, peer -> earliestTimeout) >= peer -> timeoutMinimum))) 
        
           { 
        
              enet_protocol_notify_disconnect (host, peer, event); 
        
              return 1; 
        
           }

However, I have not found ENet to behave correctly in accordance with sentence 1. As I understand sentence 1 (and based on reading some of the RTO handling code), timeoutLimit is intended to act as a maximum multiplier for the RTO. Once the RTO multiplier reaches the maximum allowed backoff (timeoutLimit), it should not back off further and continue retransmissions with the maximum RTO until timeoutMinimum expires when the connection would be terminated.

For simplicity, consider the following basic scenario:

RTT = 10
RTT variance = 0
enet_peer_timeout(peer, 4, 200, 200) was invoked

In this situation, I would expect the transmissions to occur at:

0 ms (initial transmission)
10 ms (RTO = RTT * 1 = 10 ms)
30 ms (RTO = RTT * 2 = 20 ms)
70 ms (RTO = RTT * 4 = 40 ms - max RTO reached due to timeoutLimit)
110 ms (RTO = RTT * 4 = 40 ms)
150 ms (RTO = RTT * 4 = 40 ms)
190 ms (RTO = RTT * 4 = 40 ms)
Peer disconnected due to timeoutMaximum at 200 ms

Instead, I see the following behavior:

0 ms (initial transmission)
10 ms (RTO = RTT * 1 = 10 ms)
30 ms (RTO = RTT * 2 = 20 ms)
70 ms (RTO = RTT * 4 = 40 ms)
150 ms (RTO = RTT * 8 = 80 ms) <- RTO scaled in excess of timeoutLimit
Peer disconnected due to timeoutMaximum at 200 ms

While the behavior doesn't differ as much in this basic situation, it can be quite detrimental in environments with high latency where the additional exponential backoff steps can start to reach into multiple seconds between retransmissions if left alone. In this case, I would prefer to limit the backoff at 2x or 4x RTT to avoid excessive RTO latency in the face of packet loss with high RTT variance at the cost of perhaps slightly more retransmissions than necessary. It appears that enet_peer_timeout() was designed to do this, but it doesn't currently work as intended.

The text was updated successfully, but these errors were encountered:

cgutman linked a pull request May 14, 2021 that will close this issue

fix RTO limit being exceeded during exponential backoff #160

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

enet_peer_timeout() appears to have incorrect behavior for timeoutLimit parameter #159

enet_peer_timeout() appears to have incorrect behavior for timeoutLimit parameter #159

cgutman commented May 12, 2021

enet_peer_timeout() appears to have incorrect behavior for timeoutLimit parameter #159

enet_peer_timeout() appears to have incorrect behavior for timeoutLimit parameter #159

Comments

cgutman commented May 12, 2021