We've seen reports of pipelines getting stuck after an error is thrown in the @Setup method of a transform. The suspected cause is the following race condition in CancellableQueue in the Java SDK harness. Symptom:
WARNING 2024-09-13T17:15:45.961Z Operation ongoing in bundle process_bundle-3678134576214540161-11096 for at least 06h56m00s without outputting or completing:
  at java.base/jdk.internal.misc.Unsafe.park(Native Method)
  at java.base/java.util.concurrent.locks.LockSupport.park(LockSupport.java:341)
  at java.base/java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionNode.block(AbstractQueuedSynchronizer.java:506)
  at java.base/java.util.concurrent.ForkJoinPool.unmanagedBlock(ForkJoinPool.java:3465)
  at java.base/java.util.concurrent.ForkJoinPool.managedBlock(ForkJoinPool.java:3436)
  at java.base/java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1623)
  at app//org.apache.beam.sdk.fn.CancellableQueue.take(CancellableQueue.java:95)
  at app//org.apache.beam.sdk.fn.data.BeamFnDataInboundObserver.awaitCompletion(BeamFnDataInboundObserver.java:122)
  at app//org.apache.beam.fn.harness.control.ProcessBundleHandler.processBundle(ProcessBundleHandler.java:550)
  at app//org.apache.beam.fn.harness.FnHarness$$Lambda$203/0x00007f77f52f5ef8.apply(Unknown Source)
  at app//org.apache.beam.fn.harness.control.BeamFnControlClient.delegateOnInstructionRequestType(BeamFnControlClient.java:150)
  at app//org.apache.beam.fn.harness.control.BeamFnControlClient$InboundObserver.lambda$onNext$0(BeamFnControlClient.java:115)
  at app//org.apache.beam.fn.harness.control.BeamFnControlClient$InboundObserver$$Lambda$212/0x00007f77f5301138.run(Unknown Source)
  at ...
What happened?
When some exception occurs, cancel() is called:
beam/sdks/java/core/src/main/java/org/apache/beam/sdk/fn/CancellableQueue.java
Line 123 in aeead3f
Any further invocation is then supposed to raise an exception:
beam/sdks/java/core/src/main/java/org/apache/beam/sdk/fn/CancellableQueue.java
Line 97 in aeead3f
However, if reset() is called between cancel() and the next invocation,
beam/sdks/java/core/src/main/java/org/apache/beam/sdk/fn/CancellableQueue.java
Line 143 in aeead3f
the stored exception is set back to null, so the runner never learns of the failed state and simply keeps waiting for elements that will never arrive.
This affects Java portable runners including Dataflow runner v2.
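The interleaving described above can be reproduced with a small stand-alone sketch. This is not Beam's CancellableQueue: `SketchQueue`, `takeOrTimeout`, and `RaceDemo` are hypothetical names invented for illustration, and the only assumption carried over from the report is that cancel() stores an exception which reset() later clears. The sketch runs the cancel, reset, take sequence on a single thread so the outcome is deterministic, and uses a timed wait instead of an unbounded one so that it terminates.

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical, simplified stand-in for a cancellable queue; not Beam's implementation. */
class SketchQueue<T> {
  private final ReentrantLock lock = new ReentrantLock();
  private final Condition notEmpty = lock.newCondition();
  private final Queue<T> elements = new ArrayDeque<>();
  private Exception cancellation; // non-null once cancel() has been called

  void cancel(Exception cause) {
    lock.lock();
    try {
      cancellation = cause;
      elements.clear();
      notEmpty.signalAll(); // wake any waiting consumer so it can observe the failure
    } finally {
      lock.unlock();
    }
  }

  void reset() {
    lock.lock();
    try {
      cancellation = null; // the earlier failure is forgotten here
      elements.clear();
    } finally {
      lock.unlock();
    }
  }

  /** Like a blocking take, but returns null after the timeout so the demo terminates. */
  T takeOrTimeout(long timeout, TimeUnit unit) throws InterruptedException {
    lock.lock();
    try {
      long nanos = unit.toNanos(timeout);
      while (elements.isEmpty()) {
        if (cancellation != null) {
          throw new IllegalStateException("queue cancelled", cancellation);
        }
        if (nanos <= 0) {
          return null; // a real take() would keep waiting here indefinitely
        }
        nanos = notEmpty.awaitNanos(nanos);
      }
      return elements.remove();
    } finally {
      lock.unlock();
    }
  }
}

class RaceDemo {
  /** Runs cancel(), optionally reset(), then a take, and reports what the consumer saw. */
  static String observe(boolean resetBeforeTake) {
    SketchQueue<String> queue = new SketchQueue<>();
    queue.cancel(new RuntimeException("error in @Setup"));
    if (resetBeforeTake) {
      queue.reset();
    }
    try {
      String value = queue.takeOrTimeout(100, TimeUnit.MILLISECONDS);
      return value == null ? "timed out waiting" : "got " + value;
    } catch (IllegalStateException e) {
      return "failed: " + e.getCause().getMessage();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return "interrupted";
    }
  }

  public static void main(String[] args) {
    System.out.println("cancel, take:        " + observe(false));
    System.out.println("cancel, reset, take: " + observe(true));
  }
}
```

Without the intervening reset(), the consumer sees the failure immediately. With it, the consumer finds an empty queue and no recorded exception, which in this sketch corresponds to the indefinite wait in CancellableQueue.take shown in the stack trace.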
Issue Priority
Priority: 2 (default / most bugs should be filed as P2)
Issue Components