change pytorch lightning version #125

Merged
atong01 merged 3 commits into main from alex/runner-req-patch-1 on Jul 11, 2024

Conversation

@atong01 (Owner) commented on Jul 11, 2024

What does this PR do?

Fixes #<issue_number>

Before submitting

  • Did you make sure the title is self-explanatory and the description concisely explains the PR?
  • Did you make sure your PR does only one thing, instead of bundling different changes together?
  • Did you list all the breaking changes introduced by this pull request?
  • Did you test your PR locally with the `pytest` command?
  • Did you run pre-commit hooks with the `pre-commit run -a` command?

Did you have fun?

Make sure you had fun coding 🙃

@atong01 merged commit f07c5cd into main on Jul 11, 2024
31 checks passed
@atong01 deleted the alex/runner-req-patch-1 branch on July 11, 2024 at 20:58
ImahnShekhzadeh pushed a commit to ImahnShekhzadeh/conditional-flow-matching that referenced this pull request Jul 29, 2024
* change pytorch lightning version

* fix pip version

* fix pip in code cov
atong01 added a commit that referenced this pull request Aug 21, 2024
* make code changes in `train_cifar10.py` to allow DDP (distributed data parallel)

* add instructions to README on how to run cifar10 image generation code on multiple GPUs

* fix: when running cifar10 image generation on multiple gpus, use `rank` for device setting

* fix: load checkpoint on right device

* fix runner ci requirements (#125)

* change pytorch lightning version

* fix pip version

* fix pip in code cov

* change variable name `world_size` to `total_num_gpus`

* change: do not overwrite batch size flag

* add, refactor: calculate number of epochs based on total number of steps, rewrite training loop to use epochs instead of steps

* fix: add `sampler.set_epoch(epoch)` to training loop to shuffle data in distributed mode

* rename file, update README

* add original CIFAR10 training file

---------

Co-authored-by: Alexander Tong <[email protected]>
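
For readers following the commit above, the item "calculate number of epochs based on total number of steps" can be illustrated with a minimal sketch. This is not the code from `train_cifar10.py`; the function name `epochs_for_total_steps` and its parameters are hypothetical, and it assumes the last partial batch of each epoch is kept (so steps per epoch are rounded up) and that `global_batch_size` is the number of samples consumed per optimizer step across all GPUs.

```python
import math


def epochs_for_total_steps(total_steps: int, dataset_size: int, global_batch_size: int) -> int:
    """Return the number of epochs needed to run at least `total_steps` steps.

    Hypothetical helper for illustration only. Assumes the final partial
    batch of each epoch is kept, so steps per epoch are rounded up.
    """
    if min(total_steps, dataset_size, global_batch_size) <= 0:
        raise ValueError("all arguments must be positive")
    # Number of optimizer steps needed to see the whole dataset once.
    steps_per_epoch = math.ceil(dataset_size / global_batch_size)
    # Round up so the requested step budget is always covered.
    return math.ceil(total_steps / steps_per_epoch)


if __name__ == "__main__":
    # Illustrative values only: 50,000 samples, 500 samples per step.
    print(epochs_for_total_steps(1000, 50_000, 500))
```

In an epoch-based loop like this, the commit's `sampler.set_epoch(epoch)` call would sit at the top of each epoch so that a distributed sampler reshuffles the data differently every epoch.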