Random outbound connection timeouts based on server load #587
I reported in |
@djs55 Contrary to the title of this issue, this is not actually random at all. It is very reproducible and has nothing to do with server load. Because of this, our application no longer works with any version above Docker Desktop 4.5.0 on Mac and 4.5.1 on Windows. We now have to force all customers to downgrade, which is rather serious for us. We will share a reproduction soon. EDIT: We might open a new issue, since this one is so unspecific and so far from the actual problem. |
I have also seen this issue in versions >4.5.1 including the latest version, but have found that it can also be triggered with low amounts of traffic. The following is how I have been able to reproduce the issue.
sessions.py
As mentioned before, a trace from the container's point of view shows only TCP SYNs being sent out during the 4th attempt, which is made after waiting 420s since the last request. Also, if I kill vpnkit while the 4th attempt is still in progress, the 4th request completes successfully once vpnkit starts back up. Some things I have noticed that I do not think were mentioned previously: in a trace from the host, I see the TCP SYNs going out and TCP SYN-ACKs coming back from the server, but these are not passed on to the container. If I start another container while the first is still failing on its 4th attempt, it also cannot reach the same destination, but it can reach other destinations.
The cause of the issue seems to be related to using sessions with a client-side keep-alive interval of >=60s. If I change to a 30s client keep-alive interval, I do not run into the issue.
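For anyone who wants to check the timing pattern from inside a container, here is a minimal sketch. It is not the original sessions.py: it opens a fresh TCP connection on each attempt instead of reusing a session, so it may not trigger the bug by itself. It only shows one way to detect a handshake that never completes. The names `can_connect` and `probe` are made up for illustration, and the defaults (4 attempts, 420s idle) are taken from the description above.

```python
import socket
import time


def can_connect(host, port, timeout=10.0):
    """Return True if a TCP handshake with (host, port) completes within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts (SYN never answered) as well as refused connections.
        return False


def probe(host, port, attempts=4, idle_seconds=420, timeout=10.0, sleep=time.sleep):
    """Try to connect `attempts` times, idling between tries.

    Returns a list with one boolean per attempt.
    """
    results = []
    for i in range(attempts):
        ok = can_connect(host, port, timeout)
        results.append(ok)
        print(f"attempt {i + 1}: {'ok' if ok else 'failed'}")
        if i < attempts - 1:
            sleep(idle_seconds)
    return results
```

Running `probe("example.com", 443)` in the affected container and again in a second container would show whether both fail for the same destination, as described above. The host name here is only a placeholder.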
sessions-ka30.py
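As a rough illustration of the 30s workaround (again, not the original sessions-ka30.py), this is how a TCP keep-alive idle time can be set on a plain socket. I am assuming here that "client-side keep-alive interval" means the TCP keep-alive idle time. `enable_keepalive` is a hypothetical helper name. `TCP_KEEPIDLE` is the Linux option name, which is what a Linux container sees; other platforms may use different names, so the sketch skips it when it is missing.

```python
import socket


def enable_keepalive(sock, idle_seconds=30):
    """Turn on TCP keep-alive and, where supported, set the idle time before probes."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # TCP_KEEPIDLE is available on Linux; skip it on platforms that lack it.
    if hasattr(socket, "TCP_KEEPIDLE"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle_seconds)
    return sock


if __name__ == "__main__":
    s = enable_keepalive(socket.socket(socket.AF_INET, socket.SOCK_STREAM), 30)
    print("keep-alive enabled:", s.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0)
    s.close()
```

HTTP client libraries usually need their own hook to pass socket options like these to the connections they open, so how to wire this into a session depends on the library in use.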
I hope this information helps in resolving the issue or provides a work around for others experiencing it. I have also added this information to https://github.com/docker/for-win/issues/8861 |
This issue was opened on 23 June and it remains unsolved. |
I'm also running into this on Mac. The problem gets progressively more frequent until the Docker Desktop process (and thus vpnkit) is restarted. I've gone back to Docker Desktop 4.5.0 for the time being. Would really like to see this resolved so we can begin upgrading Docker again. |
This issue was fixed for me on macOS after editing |
I was having this issue on a Linux server (not Docker Desktop), and it was fixed by changing the network mode to host. But I see this as a workaround, because we expect it to work normally with vpnkit and bridge network mode as well. I am still waiting for an update and a fix for this issue. |
I can also confirm that since Docker 4.6 I'm having the same issues. |
Sorry for the unsolicited tag, @djs55 and @avsm, but I was hoping to get some visibility on this. It's affecting lots of Docker Desktop users, who are sticking with 4.5 (February 2022) for now as a workaround. There are repro steps here and in the linked issue (docker/for-win#8861), but I haven't seen any acknowledgement that the VPNKit team is aware of the issue. Apologies if I missed it! |
This is still an issue. Worsening timeouts in Docker containers after they've been running for a few days. This bug is affecting countless developers across all fields, and should be prioritized. |
We have the same issue. Is there any update on this? It is a very critical bug and still isn't resolved. |
I've got an experimental developer build which might help. If you'd like to try it, it's here: |
Hello @djs55, thank you. I will try it and let you know. Best regards |
@djs55 When can we expect a release version of this build? I see that the latest version, 4.19, does not have this fix. |
This remains an issue on my end, whether running Docker in WSL2 or old-school using Hyper-V on Windows 11. Version 4.19 didn't resolve the problem. I am currently on version 4.20.1 (Build 110738), and the frequency of outbound connection timeouts (including DNS queries to various DNS servers) seems to have increased. Any updates on this would be much appreciated. I have to restart the Docker instance every couple of hours, which is very inconvenient. Edit: Adding some more info on fixes I tried so far:
|
Hello,
We've been experiencing random outbound connection timeouts based on server load for a very long time. After restarting the server, the problems go away, but after a while the timeouts start again. After some research I found these issues related to this topic:
docker/for-win#8861
docker/for-mac#3448
docker/for-mac#6086
docker/for-win#12671
docker/for-win#12761
I didn't put the technical logs here because these issues contain the relevant logs and results that I already have.
Anyone else having similar issues? And how can we fix it, any suggestions?
Thank you.