Runtime: Allow a step to execute only when all upstream steps have completed #850

Open
josephjclark opened this issue Jan 3, 2025 · 1 comment

Comments

@josephjclark
Collaborator

When a step completes, the runtime currently checks whether the step has any next (or downstream) steps. If so, each downstream step is added to the queue to be executed immediately. If a step has multiple upstream edges, it'll run multiple times: once after each upstream step completes.

Basically, if a node has multiple upstream edges, they are treated as logical ORs. Each time an upstream step completes, the step is executed again.
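
To make the OR behaviour concrete, here is a minimal sketch of a queue-driven loop. The `Workflow` type and `runWithOrEdges` function are hypothetical names invented for illustration, not the runtime's actual API, and "executing" a step is reduced to recording its id.

```typescript
// Hypothetical workflow shape: step id -> ids of its downstream steps.
// These names are illustrative only, not the runtime's API.
type Workflow = Record<string, string[]>;

// OR semantics: every completed step immediately enqueues all of its downstream steps.
function runWithOrEdges(workflow: Workflow, start: string[]): string[] {
  const queue: string[] = [...start];
  const executionOrder: string[] = [];
  while (queue.length > 0) {
    const step = queue.shift() as string;
    executionOrder.push(step); // stand-in for actually executing the step
    for (const next of workflow[step] ?? []) {
      queue.push(next);
    }
  }
  return executionOrder;
}

// a and b both lead to c, so c runs once per completed upstream step.
const diamond: Workflow = { a: ['c'], b: ['c'], c: [] };
const order = runWithOrEdges(diamond, ['a', 'b']);
// order is ['a', 'b', 'c', 'c']
```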

Image

This is often useful, but we also want to support a mode where a step will not execute until ALL upstream steps have completed. Like a logical AND.
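
One possible shape for AND mode is sketched here under several assumptions: the workflow is acyclic, each edge carries a precomputed `allow` flag standing in for a real edge condition, a skipped step still resolves its outgoing edges, and a step runs once if at least one upstream edge allowed it. The `Edge` type and `runWithAndJoins` function are hypothetical and not part of the runtime.

```typescript
// Hypothetical types for illustration only; not the runtime's API.
// `allow` stands in for the result of evaluating an edge condition.
type Edge = { from: string; to: string; allow: boolean };

function runWithAndJoins(steps: string[], edges: Edge[]): string[] {
  const pending = new Map<string, number>(); // unresolved upstream edges per step
  const permitted = new Set<string>(); // steps that at least one upstream edge allowed
  for (const id of steps) pending.set(id, 0);
  for (const e of edges) pending.set(e.to, (pending.get(e.to) ?? 0) + 1);

  const executed: string[] = [];
  // Steps with no upstream edges are the entry points and always run.
  const queue: Array<{ id: string; run: boolean }> = steps
    .filter((id) => pending.get(id) === 0)
    .map((id) => ({ id, run: true }));

  while (queue.length > 0) {
    const { id, run } = queue.shift()!;
    if (run) executed.push(id);
    for (const e of edges.filter((edge) => edge.from === id)) {
      // A skipped step still resolves its edges, so nothing downstream blocks forever.
      if (run && e.allow) permitted.add(e.to);
      const remaining = (pending.get(e.to) ?? 0) - 1;
      pending.set(e.to, remaining);
      // Only once every upstream edge has been tried do we decide run vs skip.
      if (remaining === 0) queue.push({ id: e.to, run: permitted.has(e.to) });
    }
  }
  return executed;
}

const executed = runWithAndJoins(
  ['a', 'b', 'c', 'x'],
  [
    { from: 'a', to: 'x', allow: true },
    { from: 'b', to: 'x', allow: false },
    { from: 'c', to: 'x', allow: true },
  ],
);
// executed is ['a', 'b', 'c', 'x']: x runs exactly once, after all three upstream steps
```

Counting unresolved edges avoids scanning the queue for waiting ancestors, because a skip propagates down the branch and indirect upstream steps are accounted for automatically. Whether that is simpler than deferring a step to the back of the queue is a design question this sketch does not settle.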

See https://community.openfn.org/t/allow-a-step-to-run-only-when-all-upstream-ancestor-steps-have-run/738

Things to consider:

  • The runtime needs to be more aware of the hierarchy of steps. A step cannot be executed unless all upstream edges have been tested (or all upstream branches have been executed)
  • In other words, a step has dependencies now and cannot run until all dependencies have had a chance to run. Does this mean looking ahead in the queue to see if any upstream (including indirect upstream) steps are waiting? And then defer to the back of the queue? I think so - but it may be more complex than this
  • Do we toggle this behaviour on the edge, on the node, or globally? Does it make sense that some branches are ORs and some are ANDs? I kind of hope not, because that's overcomplicated and hard to explain visually.
  • How to reconcile state. Three upstream steps will have three different state objects. What state does the downstream step receive? We should have a shallow first-to-last merge - just squash it all down - by default. But we also need to enable a reconcile function which takes all state objects as arguments and returns a single state.
  • Don't get blocked if some upstream steps don't execute. The runtime needs to know if all upstream edges have had a chance to run, and when they've all been tried, we can run the downstream step.
  • In other words, if two upstream steps say "execute x" and one upstream step says "don't execute x", who wins? I'd suggest that as soon as any step allows step x to run, then step x MUST run. We just have to wait for any other ancestors to run first.
  • Remember that when referring to "upstream" steps, the upstream step may be indirect. Consider the whole branch.
  • Instead of a reconcile function, should we instead have a reconcile strategy, deep vs shallow? If deep, then we'll recursively traverse all state objects and arrays and merge them. Otherwise we just spread/assign keys at the top level.
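
The two merge strategies and the custom reconcile function could look something like this. It is a sketch only: `State`, `Reconcile`, `shallowMerge` and `deepMerge` are hypothetical names, and concatenating arrays in the deep merge is just one possible policy, assumed here for illustration.

```typescript
// Hypothetical helpers for illustration; none of these names exist in the runtime.
type State = Record<string, unknown>;
type Reconcile = (...states: State[]) => State;

const isPlainObject = (v: unknown): v is State =>
  typeof v === 'object' && v !== null && !Array.isArray(v);

// Shallow, first-to-last: later states overwrite top-level keys of earlier ones.
const shallowMerge: Reconcile = (...states) => Object.assign({}, ...states);

// Deep: recurse into nested objects and concatenate arrays (an assumed array policy).
const deepMerge: Reconcile = (...states) =>
  states.reduce<State>((merged, state) => {
    for (const [key, value] of Object.entries(state)) {
      const existing = merged[key];
      if (isPlainObject(existing) && isPlainObject(value)) {
        merged[key] = deepMerge(existing, value);
      } else if (Array.isArray(existing) && Array.isArray(value)) {
        merged[key] = [...existing, ...value];
      } else {
        merged[key] = value; // scalars and mismatched types: last one wins
      }
    }
    return merged;
  }, {});

// A user-supplied reconcile function has the same signature as the built-in strategies.
const lastWins: Reconcile = (...states) => states[states.length - 1] ?? {};

const a = { data: { x: 1 }, ids: [1] };
const b = { data: { y: 2 }, ids: [2] };
const shallow = shallowMerge(a, b); // { data: { y: 2 }, ids: [2] }
const deep = deepMerge(a, b); // { data: { x: 1, y: 2 }, ids: [1, 2] }
```

Giving the strategies and the custom function one shared signature means a "strategy" option and a "function" option need not be mutually exclusive: a strategy name could simply select one of the built-in functions.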
@github-project-automation github-project-automation bot moved this to New Issues in v2 Jan 3, 2025
@josephjclark
Collaborator Author

Something is noodling me. Is there a different version of this where you set the state behaviour to be shared or branched?

Branched mode is what we do now: all edges are ORs and each branch creates a unique slice of state which cannot be read by other steps.

In Shared mode, the state object is global: it is shared by all steps. In other words, when a step runs, it sees the merged state of all the steps that have run before it. Whenever multiple branches converge, a reconcile function is needed.

But shared state doesn't imply that edges should be logical AND or OR. And shared state might give you sequencing problems if a step at the bottom of the workflow assumes that upstream steps have run and modified state, or if one branch assumes another branch has executed. The execution pipeline doesn't give you much control over this stuff, and certainly doesn't tell you what has run. So if your workflow makes assumptions about state, the diagram must be structured correctly. Shared state would allow this to be subtly violated, causing hard-to-debug problems.

So no, this is not the answer.
