-
Notifications
You must be signed in to change notification settings - Fork 115
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Rework AFSelector to normalize and simplify its locking #146
Rework AFSelector to normalize and simplify its locking #146
Conversation
With our PoC at https://github.com/kevink-sq/jetty-concurrency-issue-poc/tree/575469b3453e7b09d9d60f6460cdab8117b8e773, we noticed that there is unbounded growth of file descriptors when running our simple load test: ```bash seq 1 15000 | xargs -P1500 -I{} curl --unix-socket ./junix-ingress.sock http://localhost/process ``` Leading to timeouts from curl and timeout exceptions in jetty: ``` java.util.concurrent.TimeoutException: Idle timeout expired: 32422/30000 ms at org.eclipse.jetty.io.IdleTimeout.checkIdleTimeout(IdleTimeout.java:170) at org.eclipse.jetty.io.IdleTimeout.idleCheck(IdleTimeout.java:112) at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539) at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304) at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) at java.base/java.lang.Thread.run(Thread.java:833) ``` This patch fixes the issue by ensuring that keys with the `OP_INVALID` bit are still tracked in the selected key set and cancelled accordingly Changes: * Add invalid keys to `selectedKeysSet` in `setOpsReady` and mark as cancelled, ensuring that the fd is cleaned up timely * Switch from `cancelNoRemove` to `cancel` now that the registered keys are modified separately in the select implementation * Prefer `!key.isValid()` over `key.hasOpInvalid()` since the former is a superset of the latter * In `implCloseSelector`, cancel the keys before clearing `keysRegistered` since `keys()` is backed by `keysRegistered` * Store cancelled keys in a deque instead of a hash set, which should be a little more efficient * Remove dead code
`processDeregisterQueue`, respectively This naming convention more closely aligns with the selector implementations in Java's standard library
head of the strings
* Place bulk of `select0` operations under `this` and `publicSelectedKeys` locks to reduce locking/unlocking thrash given frequency of locking operations * Lock `publicSelectedKeys` instead of `selectedKeys` since that is what clients of AFSelector are locking on * Eschew uses of synchronized methods in favor of locking `this` via synchronized blocks * Use assertions to check lock invariants
529779c
to
1346d2c
Compare
} | ||
} | ||
|
||
private synchronized void consumeAllBytesAfterPoll() throws IOException { | ||
private void consumeAllBytesAfterPoll() throws IOException { | ||
assert Thread.holdsLock(this); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't this be assert Thread.holdsLock(publicSelectedKeys)
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This case is interesting. It doesn't actually need to lock on publicSelectedKeys
for correctness, but then we would have to lock, drop, then immediately lock on publicSelectedKeys
again if we didn't want to consume bytes under the lock. I think we can reorder some of this to reduce what is under the lock.
@@ -110,47 +111,51 @@ public int select() throws IOException { | |||
|
|||
@SuppressWarnings("PMD.CognitiveComplexity") | |||
private int select0(int timeout) throws IOException { | |||
PollFd pfd; | |||
synchronized (this) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why lock on this
when we now also lock on publicSelectedKeys
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a good question. In the standard lib, they lock on both for most of the implementation, but I think we can push down the lock for selected keys to leave initPollFd
only locked under this
.
There's a lot going on in AFSelector for sure, and it's not the most readable code. I'm not sure that this PR resolves this issue, though. As a general rule, try simplifying your change so it's easy to follow what is actually changed. Please do not rename methods/fields just because, it really makes things hard to follow, prevents unrelated changes from merging cleanly, and honestly in this case, it is a matter of consistency and maybe taste, so I don't think we need these particular changes. I can't remember exactly why, but the synchronized-locking was split around the JNI poll method for a reason. That reason may have been wrong in the first place, or things have changed, but I would want to make sure we don't introduce yet another regression. What if we focus the change on locking on |
This feedback is fair.
Okay, I am working on a separate PR with only the locking changes.
This part is odd to me as well. In Java's standard library, calls to native implementations like epoll and kqueue don't drop the locks and, as far as I can tell, it doesn't negatively impact junixsocket either. One thing I did notice is that Java's variants omit the
On it! |
Okay, I took another pass and this PR only adjusts locking: #147 I think it also addresses your comments with locking on |
7791a17
to
5d178b0
Compare
Continuing discussion in #147. |
Note that this PR is split into three commits; the first two commits are mainly renames to make the implementation more consistent with the Java standard library's implementations. For reviewing, strongly recommend ignoring whitespace changes.
Changes:
select0
operations underthis
andpublicSelectedKeys
locks to reduce locking/unlocking thrash givenfrequency of locking operations
publicSelectedKeys
instead ofselectedKeys
since that is whatclients of AFSelector are locking on
this
viasynchronized blocks