Batch ensembles and orchestrators do not have stop files written out for each member/shard #555

AlyssaCote · 2024-04-18T01:03:48Z

Description

Currently, the dashboard reads the stop.json file for each shard in an orchestrator and each member in an ensemble to determine the overall status of that entity. When orchestrators and ensembles are run as a batch, only the last member/shard gets a stop.json written out to it. As a result, the dashboard shows the entity as Running forever.
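To make the failure mode concrete, here is a minimal sketch of the kind of status aggregation described above. It is not the dashboard's actual code: the function name `entity_status`, the directory layout (one directory per member/shard), and the `return_code` field in stop.json are all assumptions made for illustration.

```python
import json
import tempfile
from pathlib import Path


def entity_status(member_dirs):
    """Hypothetical aggregation: an entity counts as "Completed" only
    when every member/shard directory contains a stop.json file."""
    for member_dir in member_dirs:
        if not (Path(member_dir) / "stop.json").is_file():
            return "Running"
    return "Completed"


def demo():
    """Simulate the batch case, then the expected case."""
    with tempfile.TemporaryDirectory() as root:
        # Assumed layout: one directory per ensemble member.
        members = [Path(root) / f"member_{i}" for i in range(3)]
        for member in members:
            member.mkdir()

        # Batch case from this issue: only the last member gets a stop.json.
        stop_payload = json.dumps({"return_code": 0})  # assumed field name
        (members[-1] / "stop.json").write_text(stop_payload)
        batch_status = entity_status(members)

        # Expected case: every member has a stop.json.
        for member in members[:-1]:
            (member / "stop.json").write_text(stop_payload)
        fixed_status = entity_status(members)

    return batch_status, fixed_status
```

Under these assumptions, the batch case never leaves Running because the earlier members never receive a stop.json, which matches the behavior reported here.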

How to reproduce

Run a driver script with an ensemble where batch_settings=your_favorite_batch_settings and an orchestrator where single_cmd=False and batch=True.

Expected behavior

My hope is that we can get stop.json files written out to each member or shard, even in a batch context.
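As a rough illustration of the expected behavior, the sketch below writes a stop.json into every member/shard directory rather than only the last one. The helper `write_stop_files`, the directory layout, and the `return_code` field are hypothetical and do not reflect the project's actual telemetry code.

```python
import json
from pathlib import Path


def write_stop_files(member_dirs, return_code):
    """Hypothetical helper: write a stop.json into each member/shard
    directory so that every member is marked as finished."""
    written = []
    for member_dir in member_dirs:
        stop_path = Path(member_dir) / "stop.json"
        # Assumed payload; the real file may carry different fields.
        stop_path.write_text(json.dumps({"return_code": return_code}))
        written.append(stop_path)
    return written
```

In a batch context, something along these lines would need to run once the batch job finishes, looping over all members instead of stopping at the last one.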

System

Anywhere you can run batch ensembles and orchestrators.

AlyssaCote added the type: bug General bug tag before severity classification label Apr 18, 2024
