The package currently supports parallelization using Dask. However, there are some important points to note:

- No CLI option for parallelization: the parallelization feature is not exposed via the command-line interface (CLI).
- Intermittent Dask issues: we are encountering the following error intermittently when using Dask for parallelization:
cat transfer_to_os_1y_U.err
/home/users/acc/.conda/envs/env_cylc/lib/python3.10/site-packages/distributed/node.py:182: UserWarning: Port 8787 is already in use.
Perhaps you already have a cluster running?
Hosting the HTTP server on port 40343 instead
warnings.warn(
/home/users/acc/.conda/envs/env_cylc/lib/python3.10/site-packages/distributed/client.py:3164: UserWarning: Sending large graph of size 18.33 GiB.
This may cause some slowdown.
Consider scattering data ahead of time and using futures.
warnings.warn(
2024-06-27 13:15:58,818 - distributed.protocol.core - CRITICAL - Failed to Serialize
Traceback (most recent call last):
File "/home/users/acc/.conda/envs/env_cylc/lib/python3.10/site-packages/distributed/protocol/core.py", line 109, in dumps
frames[0] = msgpack.dumps(msg, default=_encode_default, use_bin_type=True)
File "/home/users/acc/.conda/envs/env_cylc/lib/python3.10/site-packages/msgpack/__init__.py", line 36, in packb
return Packer(**kwargs).pack(o)
File "msgpack/_packer.pyx", line 294, in msgpack._cmsgpack.Packer.pack
File "msgpack/_packer.pyx", line 300, in msgpack._cmsgpack.Packer.pack
File "msgpack/_packer.pyx", line 297, in msgpack._cmsgpack.Packer.pack
File "msgpack/_packer.pyx", line 264, in msgpack._cmsgpack.Packer._pack
File "msgpack/_packer.pyx", line 231, in msgpack._cmsgpack.Packer._pack
File "msgpack/_packer.pyx", line 264, in msgpack._cmsgpack.Packer._pack
File "msgpack/_packer.pyx", line 272, in msgpack._cmsgpack.Packer._pack
ValueError: memoryview is too large
2024-06-27 13:15:58,822 - distributed.comm.utils - ERROR - memoryview is too large
This error occurs in roughly 30% of runs; the remaining 70% complete successfully. I have experimented with the memory size, the number of jobs, and the heartbeat timing, but none of these adjustments resolves the issue consistently.
Some users have mentioned that upgrading Dask might help; see Dask Issue #7552.
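Independent of a version bump, the "Sending large graph of size 18.33 GiB" warning indicates that the input data is being embedded in the task graph itself, and an oversized graph frame is also a plausible trigger for msgpack's "memoryview is too large" failure. Below is a minimal sketch of the scatter-then-futures pattern the warning recommends; `upload_chunk` and the `chunks` payload are invented placeholders for this package's real upload routine and data, not its actual API:

```python
import numpy as np
from distributed import Client

def upload_chunk(chunk, index):
    # Hypothetical stand-in for the real per-chunk object-store upload.
    return index, chunk.nbytes

client = Client()  # connect to (or start) a cluster

# Placeholder for the large in-memory payload that currently gets
# baked into the task graph.
chunks = [np.zeros((1000, 1000)) for _ in range(4)]

# Scatter the data to the workers first: each submitted task then
# carries only a small Future handle, so the scheduler never has to
# msgpack-serialize the multi-GiB payload inside the graph.
scattered = client.scatter(chunks)

futures = [client.submit(upload_chunk, c, i)
           for i, c in enumerate(scattered)]
print(client.gather(futures))
```

If scattering removes the warning but the intermittent failure persists, that would point away from graph size as the root cause.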
Potential Solutions:

- Add a flag to allow users to select the number of workers (sketched after this list).
- Consider alternative approaches:
  - Update the Dask version.
  - Change the parallelization method (e.g., threads, Dask delayed; also sketched after this list).
  - Test the upload process with a different object store (e.g., Oracle).
  - Monitor memory usage more closely during job execution.
  - Submit multiple smaller SLURM jobs instead of one large job.
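To make the worker-count flag and the Dask-delayed alternative concrete, here is a hedged sketch combining both: an illustrative `--n-workers` flag (not an existing option of this package's CLI) that sizes a `LocalCluster`, with the per-file work submitted through `dask.delayed`; `process_file` is a hypothetical placeholder for the real upload step:

```python
import argparse

import dask
from distributed import Client, LocalCluster

def process_file(path):
    # Hypothetical per-file work; substitute the real upload routine.
    return path

def main():
    parser = argparse.ArgumentParser()
    parser.add_argument("--n-workers", type=int, default=4,
                        help="number of Dask workers (illustrative flag)")
    parser.add_argument("paths", nargs="+", help="files to process")
    args = parser.parse_args()

    # Size the cluster from the CLI instead of hard-coding it.
    cluster = LocalCluster(n_workers=args.n_workers)
    client = Client(cluster)

    # dask.delayed builds one lazy task per file; dask.compute runs
    # them in parallel on the cluster attached via the Client.
    tasks = [dask.delayed(process_file)(p) for p in args.paths]
    results = dask.compute(*tasks)
    print(results)

if __name__ == "__main__":
    main()
```

Exposing the worker count this way would also make it easier to bisect whether the serialization failure correlates with cluster size.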