Loading checkpoint shards: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 7/7 [00:00<00:00, 20.55it/s]
2024-11-13 06:31:53.00986: FMO INFO: Finding the optimal batch size with offloading set to False
2024-11-13 06:31:54.00091: FMO INFO: Test Batch Size: 32
2024-11-13 06:32:30.00441: FMO INFO: Test Batch Size: 16
2024-11-13 06:32:48.00488: FMO INFO: Finished finding the optimal batch size: batch size: 16
Collect Activation Statistics: 2%|██▏ | 1/64 [00:39<41:05, 39.14s/it]
2024-11-13 06:33:28,997.00997: fmo.main ERROR: OOM: The process cannot run on the current device due to insufficient memory. Refer to the FAQ in README.md for handling out-of-memory errors.
However, when I specify the batch size manually (with a lower value, 8), it works without any issues. I think choosing the batch size more conservatively may help with this issue.
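To illustrate what I mean by "conservatively", here is a minimal sketch. It is not FMO's actual search code: the function name `find_conservative_batch_size`, the `fits` probe callback, and the `safety_divisor` parameter are all hypothetical. The idea is to keep halving until a test batch fits, as the log above suggests the search already does, and then back off once more so that later stages (such as collecting activation statistics) have some headroom.

```python
from typing import Callable


def find_conservative_batch_size(
    fits: Callable[[int], bool],
    start: int = 32,
    safety_divisor: int = 2,
) -> int:
    """Halve from `start` until `fits` succeeds, then back off by `safety_divisor`.

    `fits` is a hypothetical probe that returns True when a test run with the
    given batch size completes without running out of memory.
    """
    batch_size = start
    while batch_size >= 1:
        if fits(batch_size):
            # The probe passed, but the real workload may need more memory,
            # so return a smaller size instead of the first one that fits.
            return max(1, batch_size // safety_divisor)
        batch_size //= 2
    raise MemoryError("no batch size fits on this device")


# Stand-in probe for illustration only: pretend anything above 16 hits OOM.
def fake_probe(batch_size: int) -> bool:
    return batch_size <= 16


chosen = find_conservative_batch_size(fake_probe)
print(chosen)
```

With the stand-in probe, the search rejects 32, accepts 16, and then returns 8, which is the value that worked for me when set manually.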