FWIW, we experienced an issue where an executor task was lost while the scheduler was also down, and we were unable to recover gracefully. I assume this is because the executor healthcheck is performed by the scheduler directly rather than by Mesos.
@justinclayton Thanks Justin. We've just had a similar issue reported in #550. We can't decide whether this is our responsibility, i.e. whether we should add more code to work around the fact that Mesos can't handle this failure.
The healthcheck sends a custom framework message to request a status update. We might be able to use the Mesos method reconcileTasks() instead.
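To make the reconcileTasks() idea concrete, here is a rough sketch of how a restarted scheduler might ask the master for the current state of every task it remembers, rather than messaging each executor. It is not this project's code: `RecordingDriver`, the simplified `TaskStatus`, `reconcile_known_tasks`, and `apply_status_update` are all hypothetical stand-ins, and the only thing taken from Mesos is the driver call `reconcileTasks(statuses)`. Treating every terminal state as "relaunch" is also an assumption.

```python
from dataclasses import dataclass


@dataclass
class TaskStatus:
    """Simplified stand-in for the Mesos TaskStatus message (assumption)."""
    task_id: str
    state: str


class RecordingDriver:
    """Hypothetical stand-in for a Mesos SchedulerDriver; it only records calls."""

    def __init__(self):
        self.requests = []

    def reconcileTasks(self, statuses):
        # A real driver would forward this to the master, which answers
        # asynchronously through the scheduler's status update callback.
        self.requests.append(list(statuses))


# States after which the task no longer exists (assumed relaunch policy).
TERMINAL_STATES = {"TASK_LOST", "TASK_FAILED", "TASK_KILLED", "TASK_FINISHED"}


def reconcile_known_tasks(driver, known_tasks):
    """Ask the master for the latest state of every task the scheduler remembers."""
    statuses = [
        TaskStatus(task_id=task_id, state=state)
        for task_id, state in sorted(known_tasks.items())
    ]
    driver.reconcileTasks(statuses)
    return statuses


def apply_status_update(known_tasks, update):
    """Fold one status update into local state; return True if a relaunch is needed."""
    if update.state in TERMINAL_STATES:
        known_tasks.pop(update.task_id, None)
        return True
    known_tasks[update.task_id] = update.state
    return False
```

In this sketch the scheduler would call `reconcile_known_tasks` once after it re-registers, then pass each incoming status update to `apply_status_update`. A task that was lost while the scheduler was down would come back as `TASK_LOST` and be flagged for relaunch.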