Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use partial functions in flows to increase user flexibility in setting job attributes (e.g. executor) #1322

Merged
merged 22 commits into from
Dec 13, 2023

Conversation

Andrew-S-Rosen
Copy link
Member

@Andrew-S-Rosen Andrew-S-Rosen commented Dec 8, 2023

Closes #1320, although I'm not 100% confident that this is the ideal approach yet. I'm not comfortable with merging this without further justification that this is the best mechanism.

Pros:

  • There is now full flexibility in setting the attributes of each job in a workflow, meaning you can customize each executor
  • It is clearer which kwargs are being set since they are associated with the given function
  • The user can even define their own custom job for a flow without needing to create a new flow altogether

Cons:

@Andrew-S-Rosen Andrew-S-Rosen changed the title [WIP] Test for using partial functions Use partial functions in flows to increase user flexibility in setting job attributes (e.g. executor) Dec 9, 2023
Copy link

codecov bot commented Dec 10, 2023

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (e7a1f71) 98.90% compared to head (3f5512d) 97.73%.
Report is 5 commits behind head on main.

❗ Current head 3f5512d differs from pull request most recent head 2a073cd. Consider uploading reports for the commit 2a073cd to get more accurate results

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1322      +/-   ##
==========================================
- Coverage   98.90%   97.73%   -1.18%     
==========================================
  Files          76       77       +1     
  Lines        2932     2955      +23     
==========================================
- Hits         2900     2888      -12     
- Misses         32       67      +35     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@tomdemeyere
Copy link
Contributor

After looking at the changes I am not sure if this would allow to attach different executors to the various jobs? Running the first one locally and the second one with an HPC executor?

About the first con: I am not sure if it is clunkier, but I would advocate that it is more intuitive. People who are not proficient in Python might find partial functions easier to understand than passing multiple 'kwargs' (they might not know about 'kwargs' as well). Schematically it also makes more sense in my opinion, as it would be easier to explain how flow works by showing that it is built from jobs you send as parameters. You can show a first example where partial is not needed and go on from there for example.

@Andrew-S-Rosen
Copy link
Member Author

Andrew-S-Rosen commented Dec 10, 2023

After looking at the changes I am not sure if this would allow to attach different executors to the various jobs? Running the first one locally and the second one with an HPC executor?

The following works, although it is very clunky:

import covalent as ct
from ase.build import bulk
from quacc.recipes.emt.slabs import bulk_to_slabs_flow
from quacc.recipes.emt.core import static_job
from quacc import job

# Define the Atoms object
atoms = bulk("Cu")

# Modify the static job to run "local" (instead of "Dask")
@job(executor="local")
def slab_static_job(*args, **kwargs):
    return static_job(*args, **kwargs)

# Dispatch the workflow
dispatch_id = ct.dispatch(bulk_to_slabs_flow)(atoms, slab_static_job=slab_static_job)

# Print the results
result = ct.get_result(dispatch_id, wait=True)

The following was supposed to work, but it does not update the executor. I think this may be a Covalent-specific thing. I'll have to revisit this because the approach below (or one similar to it) was what I was hoping to achieve.

import covalent as ct
from ase.build import bulk
from quacc.recipes.emt.slabs import bulk_to_slabs_flow
from quacc.recipes.emt.core import static_job

# Define the Atoms object
atoms = bulk("Cu")

# Modify the static job to run "local" (instead of "Dask")
static_job.electron_object.executor = "local"

# Dispatch the workflow
dispatch_id = ct.dispatch(bulk_to_slabs_flow)(atoms, slab_static_job=static_job)

# Print the results
result = ct.get_result(dispatch_id, wait=True)

About the first con: I am not sure if it is clunkier, but I would advocate that it is more intuitive. People who are not proficient in Python might find partial functions easier to understand than passing multiple 'kwargs' (they might not know about 'kwargs' as well). Schematically it also makes more sense in my opinion, as it would be easier to explain how flow works by showing that it is built from jobs you send as parameters. You can show a first example where partial is not needed and go on from there for example.

Thanks for your input! I greatly appreciate it. " Schematically it also makes more sense in my opinion, as it would be easier to explain how flow works by showing that it is built from jobs you send as parameters." This is a very good point that I had not thought about before, but I believe you're right.

@Andrew-S-Rosen Andrew-S-Rosen added the enhancement New feature or request label Dec 12, 2023
Andrew-S-Rosen added a commit that referenced this pull request Dec 13, 2023
Closes #1092. Prefect is just not suitable with quacc for a host of
reasons. Merge this if we merge in #1322 since that will be the real
nail in the coffin.
@Andrew-S-Rosen Andrew-S-Rosen merged commit 6133849 into main Dec 13, 2023
17 checks passed
@Andrew-S-Rosen Andrew-S-Rosen deleted the slabtest branch December 13, 2023 19:46
@Andrew-S-Rosen Andrew-S-Rosen restored the slabtest branch December 13, 2023 20:19
@Andrew-S-Rosen
Copy link
Member Author

Reverted the merge. Performance issues were found afterwards. Need to continue debugging. Moved to #1359.

Andrew-S-Rosen added a commit that referenced this pull request Dec 18, 2023
…g job attributes (e.g. executor) (#1359)

Continuation of #1322.

Challenges:
- Mainly Dask, which does not play nicely when `functools.partial()` is
applied to a `Delayed` object. See
dask/dask#10707. There is now a workaround.
- Also there is a minor issue with dynamic switching of Covalent
executors, but there is a workaround. See
AgnostiqHQ/covalent#1889

Remaining issue:

Need to be able to parallelize the following code on Dask.

```python
from dask.distributed import Client
from ase.build import bulk
from quacc.recipes.emt.slabs import bulk_to_slabs_flow
from functools import partial

client = Client()

atoms = bulk("Cu")
delayed = bulk_to_slabs_flow(atoms)
result = client.gather(client.compute(delayed))
```

Hint: To monitor, set `"logfile": ""`.

---------

Co-authored-by: deepsource-autofix[bot] <62050782+deepsource-autofix[bot]@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Development

Successfully merging this pull request may close these issues.

For multi-step flows, consider having the Jobs as kwargs to allow for flexibility in executor setting
2 participants