Ruby phase 2 - Workflows #96
* `Temporalio::Workflow` is both the class to extend and the class with class methods for calling statically to do
  things.
* The `workflow_` class methods on the `Temporalio::Workflow` can only be called at class definition time and cannot
  be called on `Temporalio::Workflow` directly. These are for adjusting the definition.
There was a problem hiding this comment.
When a user writing client code writes something like
supported_languages = wf_handle.query(
GreetingWorkflow.<TAB>
will their IDE show them all the "framework" methods (including the unavailable methods starting `workflow_`)? It is nice that in other languages the user's IDE only offers the methods from their own workflow interface. I think we would like to have the same be true for Ruby if possible.
There was a problem hiding this comment.
No, I am afraid this is only available at runtime (via `method_missing`). Ruby does not allow you to access an instance method at the class level.
> I think we would like to have the same be true for Ruby if possible.
Completely agree, but I cannot think of a way to do this without some form of proxying, which we're trying to avoid for a few reasons (mostly because we'd have to lie about what the type actually is). And even if we did do proxying, Ruby is often not expressive enough to represent this well statically. Here are some other options, but all have more tradeoffs than the current approach I think.
class Workflow
  # @return [self]
  def self.stub
    StubDelegator.new(self)
  end
end
# This if you want to keep accurate with typing, but no clear way to provide
# handle and args
my_wf = MyWorkflow.stub
my_wf.update('my arg')
# This if you want better options, but you'd fail a type checker
my_wf_handle.update(MyWorkflow.stub.update, 'my arg')
# You could use blocks, but it gets confusing for users too
my_wf_handle.update(wait_for_stage: whatever) { |stub| stub.update('my arg') }
Open to suggestions here, but I fear Ruby just cannot statically express things to the IDE like this. Ruby developers are used to having to use symbols or runtime methods that are not in their IDE. Of course a user can provide YARD macros or Sorbet/RBI types to support this.
There was a problem hiding this comment.
You've probably already implicitly explained this, but in Python this is done by implementing static things like `@query` decorators as functions on a `workflow` module. Can you (re-)summarize why an approach like that doesn't work for Ruby?
There was a problem hiding this comment.
In Python you can reference an instance method via `Class.instance_method`, but in Ruby you cannot. You must have an instance of a class to reference an instance method on it. `FooWorkflow.my_query_method` will give a `NoMethodError`. You can do hacky things like `FooWorkflow.instance_method(:my_query_method)` to get an unbound method, but then you're into reflection, not IDE help.
We will support `FooWorkflow.my_query_method` via `method_missing` approaches, but that is runtime, not definition time like IDEs need. But I'm open to ideas here. There are concepts of "stubs" we have used in Java, but they have many of their own problems.
I am investigating https://github.com/Shopify/tapioca#writing-custom-dsl-compilers to see if we may be able to write a custom DSL compiler/extension to help here.
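To make the class-level vs instance-level distinction concrete, here is a minimal plain-Ruby sketch. `FooWorkflow` and `my_query_method` are just the placeholder names from the comment above, not SDK code.

```ruby
# Placeholder workflow class; not part of any SDK.
class FooWorkflow
  def my_query_method
    'hello'
  end
end

# Calling an instance method on the class itself fails at runtime.
direct_call_error =
  begin
    FooWorkflow.my_query_method
  rescue NoMethodError => e
    e
  end

# Reflection yields an UnboundMethod, which only becomes callable once it is
# bound to an instance. The method is named by a symbol, so an IDE has
# nothing to complete against.
unbound = FooWorkflow.instance_method(:my_query_method)
result = unbound.bind(FooWorkflow.new).call
```

The reflection route works, but only by naming the method with a symbol, which is the same ergonomics as passing `:my_query_method` directly.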
french: 'Bonjour, monde',
hindi: 'नमस्ते दुनिया',
portuguese: 'Olá mundo',
spanish: '¡Hola mundo'
There was a problem hiding this comment.
It bugged me:
- spanish: '¡Hola mundo'
+ spanish: '¡Hola mundo!'
None of the others have an exclamation mark though
There was a problem hiding this comment.
👍 https://github.com/temporalio/samples-python/blob/171b5e5b205167fdff4231978857c4efe1cd6225/message_passing/introduction/activities.py#L22 (where I sourced this) may also need this fix, cc @dandavison.
There was a problem hiding this comment.
Thanks! temporalio/samples-python#149
I removed the exclamation marks from all of them after I realized I had it on the wrong end for Arabic... but apparently I missed the leading one in Spanish.
* Notice lack of `timeout` on `wait_condition`, we will support Ruby `Timeout` via the fiber scheduler.
* Notice lack of `sleep`, we will support Ruby `sleep` via the fiber scheduler.
There was a problem hiding this comment.
I think I disagree with these. I thought we were gaining some consensus that the "magic" patched APIs aren't terribly helpful long-run and end up confusing users rather than teaching them that workflows have special constraints. Inevitably users need to learn this, so why delay that? If the main counter argument is that you can re-use some existing code, I know I've heard agreement from you and others before that that's very rare to actually be viable.
I think it'd be good to instead error on these calls and suggest our provided alternatives instead.
There was a problem hiding this comment.
Hrmm. Ruby explicitly delegates these two things to the scheduler: https://docs.ruby-lang.org/en/master/Fiber/Scheduler.html#method-i-timeout_after and https://docs.ruby-lang.org/en/master/Fiber/Scheduler.html#method-i-kernel_sleep.
The concern with erroring on these is that we are probably going to recommend use of the https://github.com/socketry/async library (and definitely heavily test for its use) because Ruby doesn't have good async primitives and that library leans on existing fibers. So users expect to be able to call https://socketry.github.io/async/source/Async/Task/index.html#Async::Task#with_timeout and https://socketry.github.io/async/source/Async/Task/index.html#Async::Task#sleep (now deprecated because they too encourage stdlib sleep).
This is the same type of thing with `Queue` where Ruby built in support for customized async inside it so people can use it without concern. I don't expect us to say "queue.pop confuses users and we should have our own queue" either.
In this case, I don't think these are considered "magic patched", I think it's expected by users that they work.
> If the main counter argument is that you can re-use some existing code, I know I've heard agreement from you and others before that that's very rare to actually be viable.
I think that's true elsewhere, but in Ruby, the lack of async primitives means we have to either write a ton of them, or allow use of the common `async` library built on the same fiber abstraction we're building on (almost all of that library's primitives will work based on research so far). But open to options here. I think we need to pick between:
- Allow/encourage use of https://github.com/socketry/async
- Write our own primitives (e.g. Mutex and Notification and such) and disallow/discourage use of that library
We already have the basic future primitive, so technically the second option is viable, but the proposal currently is expecting the first. (pardon the long response)
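For readers unfamiliar with how Ruby hands `sleep` to a fiber scheduler, here is a rough sketch. `VirtualTimeScheduler` is a hypothetical class written only for this illustration (it is not the SDK's scheduler and omits most of the interface); it records each `sleep` and advances a virtual clock instead of blocking.

```ruby
# Hypothetical scheduler for illustration only: it advances a virtual clock
# instead of really sleeping. It is not the SDK's scheduler.
class VirtualTimeScheduler
  attr_reader :now, :sleeps

  def initialize
    @now = 0.0
    @sleeps = []   # every duration passed to Kernel#sleep
    @timers = []   # entries of [wake_at, sequence, fiber]
    @sequence = 0
  end

  # Hook behind Fiber.schedule: start the block in a non-blocking fiber.
  def fiber(&block)
    fiber = Fiber.new(blocking: false, &block)
    fiber.resume
    fiber
  end

  # Hook Ruby calls for Kernel#sleep inside a non-blocking fiber.
  def kernel_sleep(duration = nil)
    @sleeps << duration
    @timers << [@now + duration.to_f, @sequence += 1, Fiber.current]
    Fiber.yield
  end

  # Required by the scheduler interface but unused in this sketch.
  def block(_blocker, _timeout = nil)
    raise NotImplementedError
  end

  def unblock(_blocker, _fiber)
    raise NotImplementedError
  end

  def io_wait(_io, _events, _timeout)
    raise NotImplementedError
  end

  # Resume sleeping fibers in wake-up order, moving the virtual clock forward.
  def run
    until @timers.empty?
      timer = @timers.min_by { |wake_at, sequence, _fiber| [wake_at, sequence] }
      @timers.delete(timer)
      @now = timer[0]
      timer[2].resume
    end
  end

  # Called by Ruby when the scheduler is replaced or the thread exits.
  def close
    run
  end
end

order = []
scheduler = VirtualTimeScheduler.new
Fiber.set_scheduler(scheduler)
Fiber.schedule { sleep 5; order << :long }
Fiber.schedule { sleep 1; order << :short }
Fiber.set_scheduler(nil) # triggers close, which drains the timers
```

The point is only that ordinary `sleep` calls reach `kernel_sleep`, so a workflow scheduler could turn them into durable timers without patching anything.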
There was a problem hiding this comment.
OK, yeah I can buy the argument about socketry using these under the hood.
There was a problem hiding this comment.
See #96 (comment), maybe we don't want to encourage this `async` lib after all. It does put too much trust in their deterministic fiber scheduler use (even across versions).
* ❓ Is this acceptable? This is a consequence of us wanting to share workflow context class with workflow base
  class. Basically at the top of these calls there's a `if self != Workflow` check.
There was a problem hiding this comment.
I think it's pretty much fine, but... why not just have a `WorkflowContext` class and be explicit? Or, have that be `Workflow` and the other be `WorkflowDefinition`?
There was a problem hiding this comment.
That is definitely a good question and subject to debate and is the reason I put "`Temporalio::Workflow` is both the class to extend and the class with class methods for calling statically to do things." at the top of the controversial decisions.
So basically you have these options:
Option 1 - Together
class MyWorkflow < Temporalio::Workflow
  def execute
    Temporalio::Workflow.wait_condition { false }
  end
end
Option 2 - Context class named separate
class MyWorkflow < Temporalio::Workflow
  def execute
    Temporalio::Workflow::Context.wait_condition { false }
  end
end
One benefit to this is that we do have `Temporalio::Activity::Context` that houses all of the activity things, but if you look at some other SDKs like Java and .NET, the class name is "activity context" but not "workflow context", so that disparity does exist elsewhere.
Option 3 - Workflow class named separate
class MyWorkflow < Temporalio::WorkflowImpl
  def execute
    Temporalio::Workflow.wait_condition { false }
  end
end
I am using `WorkflowImpl` here because there will be a `Temporalio::Workflow::Definition` that represents the static definition built from the impl (and signal, query, and definition classes, same as there is already `Temporalio::Activity::Definition`). Open to a better name, I just struggled to come up with one.
Option 4 - Don't even require class extension
class MyWorkflow
  def execute
    Temporalio::Workflow.wait_condition { false }
  end
end
You aren't buying that much with a class extension, but it is clear and is common in Ruby and is what we do for activities.
Thoughts/preference? Any other/better options here? Mixins would suffer the same class-for-two-purposes reuse issue.
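For anyone weighing option 1, the `self != Workflow` guard it relies on could look roughly like the sketch below. Everything here is hypothetical (`Sketch::Workflow`, the error messages, and a `wait_condition` that merely yields); it only shows how one class can serve both roles by checking the receiver.

```ruby
# Hypothetical stand-in for the shared base/context class in option 1.
module Sketch
  class Workflow
    # Context-style helper: only valid on the base class itself.
    # (Simplified: it just yields instead of actually waiting.)
    def self.wait_condition
      raise NoMethodError, 'call wait_condition on Sketch::Workflow' if self != Workflow

      yield
    end

    # Definition-style helper: only valid on a subclass, at definition time.
    def self.workflow_name(name = nil)
      raise NoMethodError, 'workflow_name is only for subclasses' if self == Workflow

      @workflow_name = name if name
      @workflow_name || self.name
    end
  end
end

class MyWorkflow < Sketch::Workflow
  workflow_name 'my-workflow'
end

misuse_on_subclass =
  begin
    MyWorkflow.wait_condition { true }
  rescue NoMethodError => e
    e.message
  end

misuse_on_base =
  begin
    Sketch::Workflow.workflow_name
  rescue NoMethodError => e
    e.message
  end
```

Options 2 and 3 remove the need for these guards by splitting the two roles into separate classes.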
There was a problem hiding this comment.
I like option 3. Clear separation, just as terse as option 1 for the vast majority of the code.
There was a problem hiding this comment.
Yeah, which is basically the same as option 2 just different on who gets the "Workflow" part. I will say I like "Context" for users more than I like "WorkflowImpl" for extenders, but I do like callers of workflow utilities to only need `Temporalio::Workflow` instead of `Temporalio::Workflow::Context` (and what drove my original approach). Either way, I am starting to lean towards separating what you extend from what you invoke. Any ideas for other names though besides either "Context" for workflow utilities or "WorkflowImpl" for the extension class?
There was a problem hiding this comment.
Not really lol. Those are pretty obvious choices. Maybe `WorkflowDefinition` like I suggested instead of impl. I would err on making that longer since you type it way less.
* 💭 Why? It is helpful for callers/client-side to be able to reference something more explicit than a symbol, and it
  also helps with making sure if the name is changed we use that name instead of the method name. Symbols can always
  be used though.
* ❓ Is this acceptable or too much magic?
There was a problem hiding this comment.
I think this kind of magic is largely fine because it's not really pretending some behavior is different, it's just advanced sugar for fetching the definition in a nice way. Works for me. The only sort of magic bit I suppose is that, if you call that, it's not invoking the method, but it would seem fairly obviously undesirable to do that anyway.
There was a problem hiding this comment.
Agreed. It is not very common in Ruby to have class methods and instance methods of the same name with different behavior, though it does happen here and there. But I think our use case uniquely justifies it.
The bigger struggle is type safety integration, but Ruby type safety is in such poor state I 1) cannot find a good way to support caller-side type safety in RBS/Sorbet (unless relying on something like this which doesn't work in IDEs anyways), and 2) do not want to degrade ease of use by the majority of users using this as normal Ruby without type safety checks. Users can of course write their own signatures for these things, e.g. in RBS you might have:
class MyWorkflow < Temporalio::Workflow
  def self.my_update: -> Temporalio::Workflow::Definition::Update[MyReturnType, MyArgType]
  def my_update: (MyArgType arg) -> MyReturnType
end
I am open to any/all designs here. Requirements for others reading:
- Workflow handlers for signal, query, and update must be easily implemented as instance methods
- Workflow handlers for signal, query, and update must be easily usable by clients without having an instance of the workflow (we don't want fake/proxy objects where method invocation does different things)
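One way to picture how both requirements could be met at runtime is a class-level `method_missing` that returns a definition object for any registered handler. The sketch below is an assumption-heavy illustration, not the proposal's implementation: `SketchWorkflow`, `QueryDefinition`, `query_definitions`, and the exact `workflow_query` signature are all hypothetical names.

```ruby
# Hypothetical base class; names and signatures are illustrative only.
class SketchWorkflow
  QueryDefinition = Struct.new(:name, :method_name)

  # Definition-time marker: the next instance method defined is a query.
  def self.workflow_query(name: nil)
    @pending_query = { name: name }
  end

  def self.query_definitions
    @query_definitions ||= {}
  end

  # Ruby calls this hook whenever an instance method is defined on the class.
  def self.method_added(method_name)
    super
    pending = @pending_query
    return unless pending

    @pending_query = nil
    query_name = (pending[:name] || method_name).to_s
    query_definitions[method_name] = QueryDefinition.new(query_name, method_name)
  end

  # Class-level lookup: returns the definition instead of invoking the handler.
  def self.method_missing(method_name, *args, &block)
    definition = query_definitions[method_name]
    return super unless definition && args.empty?

    definition
  end

  def self.respond_to_missing?(method_name, include_private = false)
    query_definitions.key?(method_name) || super
  end
end

class GreetingWorkflow < SketchWorkflow
  workflow_query name: 'languages'
  def supported_languages
    %w[english spanish]
  end
end

definition = GreetingWorkflow.supported_languages
handler_result = GreetingWorkflow.new.supported_languages
```

A client could then pass `GreetingWorkflow.supported_languages` where a symbol would otherwise go, and a renamed query (`'languages'` here) travels with the definition rather than being inferred from the method name.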
* Create a new `Temporalio::Converters::RawValue` object that is a pass-through object for raw values (still subject to
  payload codecs), and a `payload_converter` is available in workflow/activity context to convert it.
There was a problem hiding this comment.
Thank you
There was a problem hiding this comment.
👍 Python and .NET have these and they are very valuable (even for people not using dynamic, but just want to defer/control conversion for whatever reason)
def self.any_of_no_raise(*futures)
# Returns future whose result is set to nil when all futures complete
# regardless of whether any future failed.
def self.all_of_no_raise(*futures)
There was a problem hiding this comment.
I think these are confusing. The `wait` and `wait_no_raise` methods accomplish the waiting & maybe raising part just fine.
From what I understand these are really more like different conditions for when any/all resolve. The first is more like `any_success` and the second is really `all_of`, where the current `all_of` is really `try_all_of` or something. IE: It looks like the originals short circuit on any failure, and these versions do not.
That or I'm not understanding them, because what's the difference between `any_of_no_raise(...).wait` and `any_of_no_raise(...).wait_no_raise`? The latter reads particularly strangely. Either way if I'm wrong or right, I think it says the names are a bit confusing.
There was a problem hiding this comment.
> because what's the difference between `any_of_no_raise(...).wait` and `any_of_no_raise(...).wait_no_raise`?
No big difference if you have both of these calls together. But they have value independently (with the first you get a future back; with the second you have chosen not to even be able to look at the failure) and have value as consistent shortcuts. I can't imagine anyone would call the latter combo; `wait_no_raise` will more often be used by people with singular futures they don't care about.
Most languages have forms of these. But I am open to better names though don't want to get too inconsistent with each other and don't want these to be arguments since they change the return type. Basic requirements I think are:
- Ability to wait for first of multiple futures or raise on first failure
- Ability to wait for first of multiple futures but do not raise on first failure
- Ability to wait on all of multiple futures or raise on first failure
- Ability to wait on all of multiple futures but do not raise on first failure
- Ability to wait on a future or raise on failure
- Ability to wait on a future and not raise on failure
This was my best attempt after many permutations to get those 6 methods in. Definitely open to clearer suggestions.
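To make the short-circuit distinction in that list concrete, here is a rough sketch of four of the six methods. `SketchFuture` is a hypothetical, thread-backed stand-in written for this comment thread; the proposal's futures are fiber-based and their real API may differ.

```ruby
# Thread-backed stand-in for illustration; not the proposed Future class.
class SketchFuture
  attr_reader :result, :failure

  def initialize(&block)
    @mutex = Mutex.new
    @listeners = []
    @done = false
    @thread = Thread.new do
      begin
        @result = block.call
      rescue StandardError => e
        @failure = e
      end
      listeners = @mutex.synchronize do
        @done = true
        @listeners.dup
      end
      listeners.each { |queue| queue << self }
    end
  end

  # Push self onto the queue once finished (immediately if already done).
  def on_done(queue)
    already_done = @mutex.synchronize do
      @listeners << queue unless @done
      @done
    end
    queue << self if already_done
  end

  # Wait and raise the failure, if any.
  def wait
    @thread.join
    raise @failure if @failure

    @result
  end

  # Wait but swallow the failure, returning nil instead.
  def wait_no_raise
    @thread.join
    @failure ? nil : @result
  end

  # Short-circuits: fails as soon as any future fails.
  def self.all_of(*futures)
    new do
      queue = Queue.new
      futures.each { |future| future.on_done(queue) }
      futures.size.times do
        finished = queue.pop
        raise finished.failure if finished.failure
      end
      nil
    end
  end

  # Never short-circuits: resolves to nil once every future has finished.
  def self.all_of_no_raise(*futures)
    new do
      futures.each(&:wait_no_raise)
      nil
    end
  end
end

slow_ok = SketchFuture.new { sleep 0.1; :ok }
fast_fail = SketchFuture.new { raise ArgumentError, 'boom' }

short_circuited =
  begin
    SketchFuture.all_of(slow_ok, fast_fail).wait
  rescue ArgumentError => e
    e.message
  end
waited_for_all = SketchFuture.all_of_no_raise(slow_ok, fast_fail).wait
```

Note that the combinators differ in when the returned future resolves, while `wait` vs `wait_no_raise` differ only in whether the caller sees the failure.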
There was a problem hiding this comment.
Yeah I would then change the names a-la my suggestions - the most important part I think is just not to use `no_raise` in these names since it's not really about not raising, it's about not short-circuiting in the event one fails.
There was a problem hiding this comment.
> Yeah I would then change the names a-la my suggestions
Do you have a specific suggestion on names? So instead of:
- `any_of`
- `any_of_no_raise`
- `all_of`
- `all_of_no_raise`
- `wait`
- `wait_no_raise`
reading comment above, something like:
- `any_of`
- `any_of_success` - says success but includes failure?
- `try_all_of` - fails future if any fail?
- `all_of` - does not fail future if any fail, so one "x_of" doesn't fail the future and one does?
- `wait`
- `wait_no_raise`
Maybe something like `try` instead of `_no_raise`, e.g.
- `any_of`
- `try_any_of`
- `all_of`
- `try_all_of`
- `wait`
- `try_wait`
Looking for symmetry in naming between messages that fail their things and ones that don't
There was a problem hiding this comment.
I like the `try` names, but I think they should actually be the short-circuiting versions. IE: Existing `any_of` and `all_of` become the `try_xxx` versions, and the current `_no_raise` versions just become `any_of` and `all_of`. Since "try" implies "try until something breaks". This also matches the Rust naming scheme for these, FWIW.
I would probably keep `wait_no_raise` though rather than `try` there just because that's more obvious. And, I think, to my point, that is actually about raising or not where the other stuff is about short-circuiting.
There was a problem hiding this comment.
Hrmm, to me "try" means failure is captured or doesn't fail as it otherwise would, e.g. .NET `int.Parse` vs `int.TryParse`. One fails as normal, another captures failure (same as the actual `try` statement in many languages).
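Ruby's standard library has a comparable pair, shown here only as an analogy for the "try captures the failure" reading; it is not part of the proposal.

```ruby
# Raising form: invalid input surfaces as an exception.
strict =
  begin
    Integer('not a number')
  rescue ArgumentError => e
    e.class
  end

# "Try" form: the failure is captured and reported as nil instead.
lenient = Integer('not a number', exception: false)
parsed = Integer('42', exception: false)
```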
There was a problem hiding this comment.
The `try` here is like "try them, but fail if any fails soon" whereas without it is "definitely do all the futures even if some of them fail".
But, I wouldn't cry about having it the other way around, I think that can work too w/ clear docs. I do think `wait` and `wait_no_raise` should stay that way though.
* ❓ Is this an acceptable default? Python uses `[Etc.nprocessors, 4].max` as the max threads, but we figure it can
  be unbounded here and remain bounded by the Core `max_concurrent_workflow_tasks` (TODO: expose this, it was left
  off in `main`).
* ❓ Should the default share the same thread pool as activity executor?
There was a problem hiding this comment.
I don't think so, specifically because of some of the workflow isolation stuff that may or may not need to happen in them. Seems easier and cleaner to keep them separate without much downside.
There was a problem hiding this comment.
There is a difference between thread pool and executor in this case. One benefit of reusing the thread pool (but still different executors of course) is that it is an "unbounded caching thread pool" which means the same threads could be reused across workflows and activities. The threads, once returned to the pool, should carry no side effects from their use. With separate unbounded thread pools, each pool will hold idle threads that the other could have used. Sharing vs keeping separate is not an easy-vs-hard question either way, because we will have lazily created global defaults for these (and already do for activity executor).
So the only argument against I can think of is maybe some kind of concern for user clarity, but my hope is that users never touch this stuff for the most part.
is that not only are Ractors not stable and may have bugs, but they also give off a warning. A workflow instance will be
in a Ractor and Core activations will be communicated.

* ❓ Should we enable this by default, disable this by default, or somehow force the user to make a choice
There was a problem hiding this comment.
I don't think anyone cares about choosing between these two really. I think we need to figure out what has acceptable performance and correctness and do that. The warning / experimental status for Ractors seems like a nonstarter, though.
But, other than that, I can't really imagine why someone is coming along and deciding "Oh I really need to use Ractors and not TracePoint" for some reason. The only thing they care about is that the workflows run fast & errors get caught.
That is, of course, excepting some other weird interactions that aren't mentioned here or we don't know about.
There was a problem hiding this comment.
> The warning / experimental status for Ractors seems like a nonstarter, though.
I am not sure this is true. The Ractor gives so much benefit to ensuring safe determinism, it's basically a language-supported sandbox. And if we don't enable them by default (or force the user to choose whether to use them for now), we never can enable them by default. And they are quite decent for now for limited uses from my understanding.
> I can't really imagine why someone is coming along and deciding "Oh I really need to use Ractors and not TracePoint" for some reason
This section is purely for Ractors, I think we will turn on TracePoint by default.
It is a very tough call here on whether to enable Ractors by default, because Ractors are a perfect fit for this use case and we would support turning them off (at the worker level), and if we have a non-Ractor default now, we will never be able to change to the safer Ractor default if we want to later.
There was a problem hiding this comment.
> we never can enable them by default.
So why is that true?
I think the warning isn't so much the problem as the content of the warning that basically says "these are buggy". If that's not actually true, then cool, I think we can be fine with it, but, we should know that for sure.
There was a problem hiding this comment.
> So why is that true?
To clarify I mean "if we want this as default, we have to do the more restrictive default now because we can't later" because you can't go from a less-restrictive default to a more-restrictive default without breaking people. Ractors prevent state sharing (e.g. a global is different in each Ractor). Also, things that are shared across Ractors must be marked as such via https://docs.ruby-lang.org/en/master/Ractor.html#method-c-make_shareable (which is an awesome feature and we already require this be done with payload and failure converters knowing they may be used inside sandbox). So turning on Ractors by default when they weren't before would break a lot of people that were, say, using their logger or OTel impl or other things people sometimes opt out of sandboxes for.
> I think the warning isn't so much the problem as the content of the warning that basically says "these are buggy". If that's not actually true, then cool, I think we can be fine with it, but, we should know that for sure.
We definitely have to test here, and they may be buggy for some advanced uses and maybe not our own. But even if that's the case, defaulting to an experimental thing is rough and it spits out a warning, even though plenty of Ruby users do use Ractors today. Even if we default, we have to make it very easy to change and document it very clearly.
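As a small illustration of the `Ractor.make_shareable` requirement mentioned above, this sketch uses a hypothetical settings hash (not a real converter) and deliberately avoids starting a Ractor, so it does not trigger the experimental warning.

```ruby
# Hypothetical converter settings, used only for illustration.
settings = { encoding: 'json/plain', tags: ['a', 'b'] }

# A mutable object graph cannot be shared across Ractors.
shareable_before = Ractor.shareable?(settings)

# Deep-freezes the hash and everything reachable from it.
Ractor.make_shareable(settings)
shareable_after = Ractor.shareable?(settings)

# The cost of sharing: later mutation is rejected.
mutation_error =
  begin
    settings[:tags] << 'c'
  rescue FrozenError => e
    e.class
  end
```

This is why switching Ractors on later would break code that mutates a shared logger or telemetry object from inside a workflow.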
There was a problem hiding this comment.
Ah. I see. But arguably they already need to not be doing all that stuff for proper workflows anyway?
I think, if we find that we're not running into implementation bugs, it can be an OK default.
There was a problem hiding this comment.
> Ah. I see. But arguably they already need to not be doing all that stuff for proper workflows anyway?
Usually, yes, but for some things people accept they need to cheat (e.g. an interceptor sending exceptions to Sentry when not replaying, or a logger that writes to disk but is not the logger we offer).
> I think, if we find that we're not running into implementation bugs, it can be an OK default.
I'm leaning there too. But it will be super loud that we use them and how not to.
There was a problem hiding this comment.
My vote is to force users to explicitly specify whether to use Ractors or not.
If we think we might want to change the default in the future, I think it's fine to require users to explicitly specify it now and remove the requirement later (so long as we're not asking them to do that for too many things). This enables us to migrate to the more restrictive default for future new users without breaking existing users.
We did something similar for Update IIRC.
There was a problem hiding this comment.
Hrmm, no SDK requires extra options for running workflows. I think if we choose Ractors we can change the default to be less restrictive if we ever had to (though I don't suspect we ever would).
> We did something similar for Update IIRC.
This is not comparable I don't think because we know we will provide a default in the future but it is not yet implemented. Here, we don't know, so this isn't necessarily temporary. I think we might as well decide now, because in 5 years I don't think our decision would change.
I'm leaning towards the safer-though-technically-warns-as-experimental default that is easy to opt out of. Obviously it will be a heavily tested default.
There was a problem hiding this comment.
general non-blocking comments, no need to wait for my approval.
* ❓ Why not just take a dependency on `async` gem? It does have a deterministic scheduler and does a decent job of
  everything we want, but not only do we want to avoid transitive dependencies in our library (for versioning and
  other reasons), we cannot rely on its logic into the future.
There was a problem hiding this comment.
Is `async`'s deterministic behavior actually part of its contract or no? (I'm thinking of how we got burned by Python's change in behavior for asyncio; I imagine you are too)
There was a problem hiding this comment.
> Is `async`'s deterministic behavior actually part of its contract or no?
Not explicitly, but most of the things it does are delegated to the underlying fiber scheduler. The other problem is not just that it is deterministic, but that it also doesn't make changes that would be incompatible (e.g. schedule two fibers in a different order in a newer version of the library).
Concerning determinism, they do say as a comment inside https://github.com/socketry/async/blob/72fae62658b68e85d53d4c6adf13d3b34108cfe8/lib/async/scheduler.rb#L428-L435 that they intentionally try to preserve determinism, but obviously a code comment is not a clear contract (and that's their scheduler, we'll have our own).
It's a tough call, because it's a full featured library with lots of utilities that leans on the deterministic fiber scheduler, and we surely don't want to recreate every construct they have already created. But maybe we don't want to encourage use of this project? Java SDK gets along just fine with its primitive `Promise` support. We may still want to lean on traditional Ruby things like queues, timeouts, and sleep I think, but maybe not (see #96 (comment)). Regardless, we'd have to write our own mutex (we're already writing our own future).
There was a problem hiding this comment.
Yeah, this would be a much easier call if determinism were an explicit design goal, but since it's not as such, I think it's probably safer to avoid the dependency for now. Perhaps we could align our own API surface with its API so we could switch later if there is user demand? That might also help reduce the cognitive overhead of learning Temporal if the user already knows `async`.
There was a problem hiding this comment.
> I think it's probably safer to avoid the dependency for now
To clarify, we would never have a dependency, just that it would happen to work and we could recommend its use (and we'd test that it does in our tests). Having said that, I think I agree maybe we should not even recommend its use or discuss it.
> Perhaps we could align our own API surface with its API so we could switch later if there is user demand
It's too large. I think we can have `Future` (what they call `Task`) and `Mutex` manually implemented, and then decide whether to explicitly implement `sleep` and `Timeout` and `Queue` or allow reuse of the standard library ones (knowing in Ruby this isn't the same as leaning on a large set of async primitives, these are 3 very limited low-level common Ruby things that will not change behavior).
I think I'm leaning towards no `async` library recommendation, `Future` + `Mutex` implemented, `sleep` + `Timeout` + `Queue` just work from the standard library. We haven't really needed more than this in any other language.
* `Temporalio::Worker::WorkflowTaskExecutor::ThreadPool` that accepts a thread pool which defaults to max threads
  unbounded.
* There is a `default` class method with a lazily created global default.
* ❓ Is this an acceptable default? Python uses `[Etc.nprocessors, 4].max` as the max threads, but we figure it can
  be unbounded here and remain bounded by the Core `max_concurrent_workflow_tasks` (TODO: expose this, it was left
  off in `main`).
There was a problem hiding this comment.
Uhmmm, unbounded thread creation seems scary to me. I think I'm OK leaving it bounded by `max_concurrent_workflow_tasks` as long as it's actually designed that way intentionally and it's not just something that happens by accident (e.g. it should be explicit in our code/code comments that this is the intention).
There was a problem hiding this comment.
It is bounded by that from Core. But that is per worker and the thread pool is shared across workers for obvious benefits. So yes, Core ensures it cannot be more than the sum of max concurrent tasks across all workers using the thread pool.
Summary
For general discussion beyond comments here, please also join us on #ruby-sdk in Slack.