We are having issues within core Java when the JVM tries to reach a safepoint. We experience a delay of 104-108 seconds, where the JVM is paused when trying to reach the safepoint. Our logs show certain threads were in a monitor blocked state during synchronization of the sun.net.www.http.KeepAliveCache.get(URL, Object) method.
Lately the issue has been occurring multiple times per week.
To Reproduce
We have not been able to reproduce the safepoint pause in a test environment.
Expected behavior
Time to reach safepoint should be in milliseconds, not seconds.
I can see you have an incredibly long time-to-safepoint (sync phase), 108196 milliseconds. However, I don't see the connection to the blocked threads you've shared. Blocked threads are waiting on a monitor, which does not stop the thread coming to a safepoint. In fact, blocked threads are already at a safepoint, so they do not contribute to time-to-safepoint delays.
Only a running thread will delay reaching a safepoint, in normal circumstances. It's also possible that something outside the JVM is causing a pause, for example something at the OS level, and it just seems like it's related to safepoints because that's where the metrics are.
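To make the blocked-versus-running distinction concrete at the Java level, here is a minimal sketch. The class and method names are hypothetical and not taken from the issue. It only shows that a thread waiting to enter a `synchronized` block reports `Thread.State.BLOCKED`, which is the state Flight Recorder labels as Monitor Blocked; it does not measure safepoint timing.

```java
// Hypothetical illustration; names are not from the issue.
final class BlockedStateSketch {

    // Starts a thread that tries to enter a monitor the caller already holds,
    // then returns the state observed for that thread while it waits.
    static Thread.State observeWaiterState() {
        final Object lock = new Object();
        Thread waiter = new Thread(() -> {
            synchronized (lock) {
                // Nothing to do; the point is only to contend for the monitor.
            }
        }, "waiter");

        Thread.State observed;
        try {
            synchronized (lock) {
                waiter.start();
                // Poll for up to 5 seconds until the waiter is parked on the monitor.
                long deadline = System.nanoTime() + 5_000_000_000L;
                while (waiter.getState() != Thread.State.BLOCKED
                        && System.nanoTime() < deadline) {
                    Thread.sleep(1);
                }
                observed = waiter.getState();
            }
            // The monitor is released here, so the waiter can finish.
            waiter.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while observing", e);
        }
        return observed;
    }

    public static void main(String[] args) {
        System.out.println("waiter state: " + observeWaiterState());
    }
}
```

A thread in this state is parked inside the VM rather than executing Java code, which is consistent with the point above that blocked threads are not what a safepoint has to wait for.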
If it is truly a safepoint issue, I suggest two techniques for diagnosis.
Use these debug flags to print the thread(s) which are slow to reach the safepoint:
Use https://github.com/async-profiler/async-profiler with --ttsp (time-to-safepoint) profiling mode enabled. This will show you what activity is happening in your process while waiting for all threads to reach a safepoint. If async-profiler doesn't show any/many samples during that time, it's an indication that this is something happening external to the java process.
In JDK 8, you can still use -XX:+PrintSafepointStatistics to dump time-to-safepoint (TTSP) statistics. As @olivergillespie pointed out, if you do hit a safepoint timeout, HotSpot will report which threads had not reached the safepoint. That will help you find the culprit.
Besides external causes, JDK 8u has a defect with counted loops. If your program has a counted loop that runs for a very long time, the thread running it may not reach the safepoint promptly, which keeps every other Java thread waiting and can trigger the timeout. In that case, use -XX:+UseCountedLoopSafepoints or refactor your code.
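As a rough illustration of the "refactor your code" option, the sketch below contrasts a plain counted loop with a manually chunked version. The class name, method names, and chunk size are hypothetical. It assumes the commonly described JDK 8 C2 behavior, where an int-indexed loop with a fixed stride may be compiled without a safepoint poll in its body, while a loop with a long index keeps a poll on its back edge. Whether this matters for a given workload should be confirmed with the diagnostics suggested earlier in the thread.

```java
// Hypothetical illustration; names and chunk size are assumptions.
final class CountedLoopSketch {

    // Upper bound on how many iterations run between outer-loop back edges.
    static final int CHUNK = 1 << 16;

    // A classic counted loop: int index, constant stride, loop-invariant bound.
    // Under the stated assumption, the JIT may omit the safepoint poll here.
    static long sumPlain(int[] data) {
        long sum = 0;
        for (int i = 0; i < data.length; i++) {
            sum += data[i];
        }
        return sum;
    }

    // Manual chunking: the outer loop uses a long index, so (under the stated
    // assumption) it is not treated as a counted loop and keeps its poll.
    // The inner counted loop runs at most CHUNK iterations per outer pass.
    static long sumChunked(int[] data) {
        long sum = 0;
        for (long start = 0; start < data.length; start += CHUNK) {
            int end = (int) Math.min((long) data.length, start + CHUNK);
            for (int i = (int) start; i < end; i++) {
                sum += data[i];
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        int[] data = new int[1_000_000];
        for (int i = 0; i < data.length; i++) {
            data[i] = i % 7;
        }
        System.out.println(sumPlain(data) == sumChunked(data));
    }
}
```

Both methods compute the same result; the chunked form only bounds how long the thread can run between opportunities to stop. Enabling -XX:+UseCountedLoopSafepoints aims at the same effect without code changes, at some cost to loop throughput.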
Logs
Safepoint log:
Mar 28 08:05:16 server: vmop [threads: total initially_running wait_to_block] [time: spin block sync cleanup vmop] page_trap_count
Mar 28 08:05:16 server: 96773.234: ForceAsyncSafepoint [ 377 0 2 ] [ 0 0108196 0 0 ] 0
Flight Recorder Monitor Blocked state:
With Flight Recorder enabled, the following threads were in the Monitor Blocked state during the incident:
Thread Stack Trace
Based on the stack traces, it looks like an issue with HttpClient at the synchronized sun.net.www.http.KeepAliveCache.get(URL, Object) method.
Example 1:
HttpClient sun.net.www.http.KeepAliveCache.get(URL, Object)
HttpClient sun.net.www.protocol.https.HttpsClient.New(SSLSocketFactory, URL, HostnameVerifier, Proxy, boolean, int, HttpURLConnection)
HttpClient sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.getNewHttpClient(URL, Proxy, int)
void sun.net.www.protocol.http.HttpURLConnection.plainConnect0()
void sun.net.www.protocol.http.HttpURLConnection.plainConnect()
void sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.connect()
InputStream sun.net.www.protocol.http.HttpURLConnection.getInputStream0()
InputStream sun.net.www.protocol.http.HttpURLConnection.getInputStream()
int java.net.HttpURLConnection.getResponseCode()
int sun.net.www.protocol.https.HttpsURLConnectionImpl.getResponseCode()
...
Example 2:
HttpClient sun.net.www.http.KeepAliveCache.get(URL, Object)
HttpClient sun.net.www.http.HttpClient.New(URL, Proxy, int, boolean, HttpURLConnection)
HttpClient sun.net.www.http.HttpClient.New(URL, Proxy, int, HttpURLConnection)
HttpClient sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(URL, Proxy, int)
void sun.net.www.protocol.http.HttpURLConnection.plainConnect0()
void sun.net.www.protocol.http.HttpURLConnection.plainConnect()
void sun.net.www.protocol.http.HttpURLConnection.connect()
OutputStream sun.net.www.protocol.http.HttpURLConnection.getOutputStream0()
OutputStream sun.net.www.protocol.http.HttpURLConnection.getOutputStream()
....
Platform information
AWS Corretto: 8.352.08.1
OS: Amazon Linux 2
Kernel: Linux 4.14.294-220.533.amzn2.x86_64