OneForAll strategy #11

adinapoli · 2016-02-01T09:45:21Z

No description provided.

srijs · 2016-02-02T07:36:19Z

So I've been looking into implementing this, and what gives me trouble is that RestartStrategy bundles both the RetryStatus and the actual strategy (e.g. OneForOne, etc.). As a consequence, if I understand this correctly (please correct me if I'm wrong), the strategy on how to restart is actually linked to the individual child process.

If you look at how Erlang implements it, you'll see that the strategy with which to restart children is on a supervisor level, and retry state is maintained on a per-child basis. i.e. if a child terminates, we want to be looking at what the strategy of the supervisor is to deal with this, rather than what strategy is linked to the child that died.

The changes necessary to implement this (from my perspective) would be that Supervisor_ would carry the RestartStrategy, instead of Child_. I would suggest to move the RetryStatus from RestartStrategy into Child_ as well.

Since those would be breaking changes (and we would necessarily also need to change the way a supervisor is constructed), could you have a look @adinapoli and tell me if there's a simpler option in your mind?

adinapoli · 2016-02-02T09:56:10Z

Hey @srijs ! I will look at this during off hours (it's quite unfortunate that during our timezones your hacking time matches my working hours 😉 ).

What I can tell you, on top of my mind, after months of not touching this code, is (I will refine this later once I refresh my brain):

Surely it would be nice to stick to what Erlang does, as it's a proved model. I'm not an Erlang programmer and I needed only one kind of strategy, so I implemented only what I needed based on a superficial understanding of how Erlang Supervisors work.
I'm not too much concerned about breaking changes; I'm much more compelled with having the library in a healthier, more active, more maintained, more used state. Nobody really cares about breaking changes if nobody is using your library 😉
I will add an extra comment once I read the code on what I think about the issue you mentioned, and what was my rational behind the original implementation.

srijs · 2016-02-02T10:27:13Z

Thanks for taking the time to comment. I like your attitude :)

I'm looking forward to hearing more from you later. In the meantime, I'll try to open a PR with the changes as I would propose them. It is almost always better to be able to talk about something that is concrete :)

adinapoli · 2016-02-02T10:30:37Z

No problem at all! Thanks for contributing, it's really nice to see this library moving forward 😉

Later!

srijs · 2016-02-07T03:13:44Z

Thinking further regarding this feature, I think we might need to introduce something like "epochs" for the children of a supervisor and the events that are emitted from it. Here's the problem that it would be solving:

Suppose we have a supervisor that has child process which rely on each other. Think of two coroutines that send each other messages, but fails if they can't reach each other.
Now one of those children crashes for some reason, and following the OneForAll strategy, the supervisor proceeds to terminate all other processes.
In the meantime, the other child was not able to send a message to the child that crashed, and thus terminates as well. When it does, a new DeadLetter is queued up.
The supervisor has terminated all processes, and now proceeds to bring them up again.
After all children are restarted, the next iteration of handleEvents starts, picking up on the DeadLetter that has happened before the restart.
All children are terminated again (not what we want).

By introducing epochs, each supervisor would track an increasing number (the epoch) for each child. Whenever it decides to restart (or terminate) a child, it increases the epoch for it. DeadLetter messages with an epoch lower than the currently tracked one are ignored.

What do you think, @adinapoli?

adinapoli · 2016-02-08T07:28:59Z

Hey @srijs !

I think that what you are proposing makes a great deal of sense, but just to play the devil's advocate a bit, are you aware if the Erlang OTP does something similar or it's a novel approach? Generally speaking might be good to see what they do to rely on proved, battle-tested models!

Said that, I have also jotted down a couple of caffeine-deprived thoughts on the shape of the library here:

#15 (comment)

Please chime in 😉

adinapoli assigned srijs Feb 1, 2016

srijs mentioned this issue Feb 2, 2016

refactor: move restart strategy into supervisor #13

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

OneForAll strategy #11

OneForAll strategy #11

adinapoli commented Feb 1, 2016

srijs commented Feb 2, 2016

adinapoli commented Feb 2, 2016

srijs commented Feb 2, 2016

adinapoli commented Feb 2, 2016

srijs commented Feb 7, 2016

adinapoli commented Feb 8, 2016

OneForAll strategy #11

OneForAll strategy #11

Comments

adinapoli commented Feb 1, 2016

srijs commented Feb 2, 2016

adinapoli commented Feb 2, 2016

srijs commented Feb 2, 2016

adinapoli commented Feb 2, 2016

srijs commented Feb 7, 2016

adinapoli commented Feb 8, 2016