Batch ensembles and orchestrators do not have stop files written out for each member/shard #555

Open
AlyssaCote opened this issue Apr 18, 2024 · 0 comments
Labels
type: bug General bug tag before severity classification

Comments

@AlyssaCote
Contributor

Description

Currently, the dashboard reads the stop.json file of each shard in an orchestrator and each member in an ensemble to derive an overall status for that entity. When orchestrators and ensembles are run as a batch, only the last member or shard gets a stop.json written. As a result, the dashboard shows the entity as Running forever.
batch_orc.png
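
To make the failure mode concrete, here is a minimal sketch of how a status roll-up of this kind behaves. It is not the dashboard's actual code: the function names, the one-directory-per-shard layout, and the "Completed" label are assumptions for illustration. Only the stop.json file name and the Running status come from the description above.

```python
import tempfile
from pathlib import Path

STOP_FILE = "stop.json"  # file name taken from the issue description


def entity_status(member_dirs):
    """Hypothetical roll-up: 'Completed' only when every member has a stop file."""
    all_stopped = all((Path(d) / STOP_FILE).is_file() for d in member_dirs)
    return "Completed" if all_stopped else "Running"


def simulate_batch_run(root, names, stopped):
    """Create one directory per member; write stop.json only for names in `stopped`."""
    dirs = []
    for name in names:
        member_dir = Path(root) / name
        member_dir.mkdir(parents=True)
        if name in stopped:
            (member_dir / STOP_FILE).write_text("{}")
        dirs.append(member_dir)
    return dirs


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        # Mimic the bug: only the last shard receives a stop.json.
        shards = simulate_batch_run(
            root, ["shard_0", "shard_1", "shard_2"], {"shard_2"}
        )
        print(entity_status(shards))
```

Under these assumptions, a single missing stop.json is enough to keep the whole entity in the Running state, which matches what the screenshot shows.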

How to reproduce

Run a driver script that launches an ensemble created with batch_settings=<your batch settings> and an orchestrator created with single_cmd=False and batch=True.

Expected behavior

My hope is that we can get stop.json files written out to each member or shard, even in a batch context.
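
One way to check for the expected behavior after a batch run is to list the members or shards that still lack a stop file. The sketch below assumes each member or shard has its own subdirectory directly under the entity's output directory. That layout and the helper name are hypothetical and not taken from the project.

```python
from pathlib import Path


def members_missing_stop_file(entity_dir, stop_name="stop.json"):
    """Return the sorted names of immediate subdirectories lacking a stop file."""
    entity_dir = Path(entity_dir)
    return sorted(
        child.name
        for child in entity_dir.iterdir()
        # Assumption: every subdirectory corresponds to one member or shard.
        if child.is_dir() and not (child / stop_name).is_file()
    )
```

An empty list would mean every member or shard has its stop.json, which is the outcome this issue asks for.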

System

Anywhere you can run batch ensembles and orchestrators.

@AlyssaCote AlyssaCote added the type: bug General bug tag before severity classification label Apr 18, 2024