Use torch generic workflow for CI, add ssh, artifacts #325

Merged
merged 15 commits into from
May 15, 2024
82 changes: 49 additions & 33 deletions .github/workflows/unit_test_4gpu.yaml
@@ -1,42 +1,58 @@
name: 4 GPU Unit Test


on:
  push:
    branches: [ main ]
  pull_request:

concurrency:
  group: unit-test${{ github.workflow }}-${{ github.ref == 'refs/heads/main' && github.run_number || github.ref }}
  cancel-in-progress: true
jobs:
  build-test:
    uses: pytorch/test-infra/.github/workflows/linux_job.yml@main
    with:
      runner: linux.g5.12xlarge.nvidia.gpu
      gpu-arch-type: cuda
      gpu-arch-version: "11.6"
      repository: "pytorch/torchtitan"
      script: |
        pip3 install --pre torch --index-url https://download.pytorch.org/whl/nightly/cu121
        python -m pip install -r requirements.txt
        python -m pip install -r dev-requirements.txt
        python ./test_runner.py

defaults:
  run:
    shell: bash -l -eo pipefail {0}

jobs:
  unit_tests_4gpu:
    runs-on: linux.g5.12xlarge.nvidia.gpu
    strategy:
      matrix:
        python-version: ['3.10']
    steps:
      - name: Check out repo
        uses: actions/checkout@v3
      - name: Setup conda env
        uses: conda-incubator/setup-miniconda@v2
        with:
          auto-update-conda: true
          miniconda-version: "latest"
          activate-environment: test
          python-version: ${{ matrix.python-version }}
      - name: Update pip
        run: python -m pip install --upgrade pip
      - name: Install dependencies
        run: |
          pip3 install --pre torch --index-url https://download.pytorch.org/whl/nightly/cu121
          python -m pip install -r requirements.txt
          python -m pip install -r dev-requirements.txt
      - name: Run test_runner.py
        run: python ./test_runner.py
      - name: Upload Coverage to Codecov
        uses: codecov/codecov-action@v3
# concurrency:
#   group: unit-test${{ github.workflow }}-${{ github.ref == 'refs/heads/main' && github.run_number || github.ref }}
#   cancel-in-progress: true

# defaults:
#   run:
#     shell: bash -l -eo pipefail {0}

# jobs:
#   unit_tests_4gpu:
#     runs-on: linux.g5.12xlarge.nvidia.gpu
#     strategy:
#       matrix:
#         python-version: ['3.10']
#     steps:
#       - name: Check out repo
#         uses: actions/checkout@v3
#       - name: Setup conda env
#         uses: conda-incubator/setup-miniconda@v2
#         with:
#           auto-update-conda: true
#           miniconda-version: "latest"
#           activate-environment: test
#           python-version: ${{ matrix.python-version }}
#       - name: Update pip
#         run: python -m pip install --upgrade pip
#       - name: Install dependencies
#         run: |
#           pip3 install --pre torch --index-url https://download.pytorch.org/whl/nightly/cu121
#           python -m pip install -r requirements.txt
#           python -m pip install -r dev-requirements.txt
#       - name: Run test_runner.py
#         run: python ./test_runner.py
#       - name: Upload Coverage to Codecov
#         uses: codecov/codecov-action@v3
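
Note on the `concurrency` group in this workflow: the expression `github.ref == 'refs/heads/main' && github.run_number || github.ref` acts as a ternary. On `main` the group name ends in the run number, so every run gets its own group and nothing is cancelled. On any other ref, the group name ends in the ref, so a newer run on the same ref cancels the one in progress. The Python sketch below only models that string construction; the function name `concurrency_group` and its parameters are illustrative and are not part of GitHub Actions or this repository.

```python
def concurrency_group(workflow: str, ref: str, run_number: int) -> str:
    """Model of the workflow's concurrency group string (illustrative only)."""
    # `a && b || c` in a GitHub Actions expression yields b when a is true
    # and b is truthy; run numbers start at 1, so b is always truthy here.
    suffix = str(run_number) if ref == "refs/heads/main" else ref
    return f"unit-test{workflow}-{suffix}"


# Two pushes to main land in different groups, so neither cancels the other.
print(concurrency_group("4 GPU Unit Test", "refs/heads/main", 17))
print(concurrency_group("4 GPU Unit Test", "refs/heads/main", 18))

# Two runs on the same pull request ref share a group, so the newer run
# cancels the older one when cancel-in-progress is true.
print(concurrency_group("4 GPU Unit Test", "refs/pull/325/merge", 19))
print(concurrency_group("4 GPU Unit Test", "refs/pull/325/merge", 20))
```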