Drain SinkManyEmitterProcessor buffer after cancel #3789

bajibalu · 2024-04-18T08:40:21Z

The internal buffer/queue in SinkManyEmitterProcessor will be drained after all the subscriptions are canceled.

As explained here, the queue/buffer in SinkManyEmitterProcessor is not drained properly after the last subscriber cancels its subscription. This happens because the WIP marker is left in an unclean state. This PR fixes the issue by updating the WIP marker. I am not sure whether this is the ideal approach, though.

Fixes #3715

bajibalu · 2024-04-18T08:48:14Z

reactor-core/src/main/java/reactor/core/publisher/SinkManyEmitterProcessor.java

@@ -383,10 +383,12 @@ public Object scanUnsafe(Attr key) {
 	}

 	final void drain() {
-		if (WIP.getAndIncrement(this) != 0) {
+		if (WIP.get(this) != 0) {


When more than one thread calls the drain function, every thread increments the WIP counter, while only one of them proceeds to clear the queue and decrement the counter. This leaves WIP in an unclean state.

bajibalu · 2024-04-18T08:49:07Z

reactor-core/src/main/java/reactor/core/publisher/SinkManyEmitterProcessor.java

@@ -398,6 +400,7 @@ final void drain() {
 			boolean empty = q == null || q.isEmpty();

 			if (checkTerminated(d, empty)) {
+				WIP.addAndGet(this, -missed);


Similarly, these early returns also leave WIP in an unclean state.

chemicL · 2024-04-18T13:51:30Z

reactor-core/src/main/java/reactor/core/publisher/SinkManyEmitterProcessor.java

@@ -383,10 +383,12 @@ public Object scanUnsafe(Attr key) {
 	}

 	final void drain() {
-		if (WIP.getAndIncrement(this) != 0) {
+		if (WIP.get(this) != 0) {


By separating the atomic operation into two operations, you break the exclusive access to the critical section below. Please start by creating JCStress tests if you're considering spending more time on this. These drain methods are quite a critical piece, and ensuring proper lock-free coordination between the different actors is essential. Another aspect is performance: we want to make as few volatile accesses as we can.
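For readers unfamiliar with the pattern under discussion, here is a minimal stand-alone sketch of a WIP-guarded queue drain. It is not the Reactor implementation: the class `WipDrainSketch`, its fields, and the `run` helper are hypothetical names made up for illustration. It only shows why a single `getAndIncrement` serves both as the "more work arrived" signal and as the guard for the critical section.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical illustration of the WIP drain pattern; not Reactor code. */
public class WipDrainSketch {
    final Queue<Integer> queue = new ConcurrentLinkedQueue<>();
    final AtomicInteger wip = new AtomicInteger();
    final AtomicInteger drained = new AtomicInteger();
    // Instrumentation only: tracks how many threads are inside the drain loop.
    final AtomicInteger inside = new AtomicInteger();
    final AtomicInteger maxConcurrentDrainers = new AtomicInteger();

    void offerAndDrain(int value) {
        queue.offer(value);
        drain();
    }

    void drain() {
        // One atomic step: register pending work AND learn whether a drainer is active.
        if (wip.getAndIncrement() != 0) {
            return; // the active drainer will observe this increment and loop again
        }
        int missed = 1;
        for (;;) {
            int now = inside.incrementAndGet();
            maxConcurrentDrainers.accumulateAndGet(now, Math::max);
            while (queue.poll() != null) {
                drained.incrementAndGet();
            }
            inside.decrementAndGet();

            // Subtract the signals already handled; non-zero means more arrived meanwhile.
            missed = wip.addAndGet(-missed);
            if (missed == 0) {
                return;
            }
        }
    }

    /** Runs several producer threads that all call drain() concurrently. */
    static WipDrainSketch run(int threads, int perThread) {
        WipDrainSketch s = new WipDrainSketch();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    s.offerAndDrain(i);
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                throw new IllegalStateException(e);
            }
        }
        return s;
    }

    public static void main(String[] args) {
        WipDrainSketch s = run(4, 10_000);
        System.out.println("drained=" + s.drained.get()
                + " wip=" + s.wip.get()
                + " maxConcurrentDrainers=" + s.maxConcurrentDrainers.get());
    }
}
```

In this sketch, the increments made by the non-draining threads are not leftovers: the draining thread consumes them through `missed`, so the counter returns to zero once all work is handled. Replacing `getAndIncrement` with a plain read followed by a separate update would let two threads both observe zero and enter the loop together.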

bajibalu · 2024-04-27T00:46:16Z

@chemicL I addressed your comments and also added a JCStress test. But I am not sure whether the test case is correct.

chemicL · 2024-07-29T09:23:44Z

Hey. Thank you for the effort. Your PR has helped unravel some hidden complexities and is pushing the research into this space forward. Unfortunately, I'm not able to accept it in its current form. There are multiple issues here, and it would require quite an effort on my part to help correct them. The issues I can see are:

- Consider reviewing how the concept of WIP (work-in-progress) is used in the codebase. It is a mechanism to signal a Thread that is currently draining that more work needs to be considered, and also a means to guard access to the critical section in a lock-free fashion.
- In case of cancellation, the return value can't be "OK"; it should be "FAIL_CANCELLED".
- The JCStress test should not consider an item stuck in the queue forever as "acceptable, interesting"; that outcome should fail the evaluation.

In general, I think the SinkManyEmitterProcessor is a super critical piece in the Sinks API and should be handled with care. The attempt to fix it should consider a broad set of circumstances and would require means to atomically represent the cancelled state and a way to reject new additions and ensure proper cleanup in the face of high concurrency. With that, I'm sorry to reject the PR.
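To make the requirements above a little more concrete, here is a deliberately simplified sketch of a buffer that rejects emissions after cancellation and clears itself on cancel. Everything in it is hypothetical: the class `CancelAwareQueueSketch`, the local `Result` enum (a stand-in for the result codes named in the review), and the `request`/`cancel` methods are assumptions for illustration, not Reactor's API or a proposed fix. In particular, it does not close the race in which `tryEmit` passes the cancellation check just before `cancel()` runs; in that case the item is discarded by the drain loop but the caller still sees `OK`, which is the kind of gap an atomic representation of the cancelled state would have to address.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical illustration only; names and structure are not Reactor's. */
public class CancelAwareQueueSketch {
    /** Local stand-in for the emission result codes. */
    enum Result { OK, FAIL_CANCELLED }

    final Queue<String> queue = new ConcurrentLinkedQueue<>();
    final AtomicInteger wip = new AtomicInteger();
    final AtomicLong demand = new AtomicLong();
    // Touched only inside the WIP-guarded loop, so no extra synchronization is needed.
    final StringBuilder delivered = new StringBuilder();
    volatile boolean cancelled;

    Result tryEmit(String value) {
        if (cancelled) {
            return Result.FAIL_CANCELLED; // reject instead of buffering
        }
        queue.offer(value);
        drain(); // if cancel raced with the offer, this pass discards the item
        return Result.OK;
    }

    void request(long n) {
        demand.addAndGet(n);
        drain();
    }

    void cancel() {
        cancelled = true;
        drain(); // ensure whatever is buffered gets cleared
    }

    void drain() {
        if (wip.getAndIncrement() != 0) {
            return; // the active drainer will pick up this signal
        }
        int missed = 1;
        for (;;) {
            if (cancelled) {
                queue.clear(); // cleanup: nothing may linger after cancel
            } else {
                while (demand.get() > 0) {
                    String v = queue.poll();
                    if (v == null) {
                        break;
                    }
                    delivered.append(v);
                    demand.decrementAndGet();
                }
            }
            missed = wip.addAndGet(-missed);
            if (missed == 0) {
                return;
            }
        }
    }
}
```

Because `cancel()` goes through the same `drain()` entry point as emissions and requests, the cleanup runs under the same exclusive access as delivery, and the counter ends at zero after every call.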

bajibalu requested a review from a team as a code owner April 18, 2024 08:40

bajibalu mentioned this pull request Apr 18, 2024

Auto-cancelled Sink still accepts emissions #3715

Open

bajibalu commented Apr 18, 2024

bajibalu force-pushed the auto-cancel-sink branch from 3ad4a0b to 251d759 Compare April 18, 2024 08:51

chemicL requested changes Apr 18, 2024

Drain SinkManyEmitterProcessor buffer after cancel

d7e3d7d

bajibalu force-pushed the auto-cancel-sink branch from 251d759 to d7e3d7d Compare April 18, 2024 14:28

Added JSStress test

069b7d4

chemicL closed this Jul 29, 2024
