
Merge dev to main (#53)
arjunsuresh authored Dec 10, 2024
2 parents b37fda0 + d119eb3 commit 61bb8e1
Showing 25 changed files with 164 additions and 66 deletions.
3 changes: 2 additions & 1 deletion .github/workflows/build_wheel.yml
@@ -6,7 +6,7 @@ on:
push:
branches:
- main
- mlperf-inference
- dev
paths:
- VERSION
- setup.py
@@ -31,6 +31,7 @@ jobs:
with:
fetch-depth: 2
ssh-key: ${{ secrets.DEPLOY_KEY }}
ref: ${{ github.ref_name }}

# Step 2: Set up Python
- uses: actions/setup-python@v3
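The added `ref: ${{ github.ref_name }}` makes the checkout track whichever branch triggered the push (main or dev) instead of the default detached commit — relevant here since the workflow pushes back with a deploy key. For a push to dev this is roughly equivalent to:

git fetch origin dev    # sketch only; actions/checkout does more than this
git checkout dev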
3 changes: 1 addition & 2 deletions .github/workflows/publish-docs.yaml
@@ -9,8 +9,7 @@ on:
push:
branches:
- main
- docs
- mlperf-inference
- dev
paths:
- docs/**
- mkdocs.yml
8 changes: 4 additions & 4 deletions .github/workflows/test-mlperf-inference-abtf-poc.yml
@@ -20,7 +20,7 @@ jobs:
python-version: [ "3.8", "3.12" ]
backend: [ "pytorch" ]
implementation: [ "python" ]
docker: [ "", " --docker --docker_it=no --docker_cm_repo=gateoverflow@cm4mlops --docker_dt=yes" ]
docker: [ "", " --docker --docker_it=no --docker_cm_repo=mlcommons@mlperf-automations --docker_cm_repo_branch=dev --docker_dt=yes" ]
extra-args: [ "--adr.compiler.tags=gcc", "--env.CM_MLPERF_LOADGEN_BUILD_FROM_SRC=off" ]
exclude:
- os: ubuntu-24.04
@@ -30,16 +30,16 @@
- os: windows-latest
extra-args: "--adr.compiler.tags=gcc"
- os: windows-latest
docker: " --docker --docker_it=no --docker_cm_repo=gateoverflow@cm4mlops --docker_dt=yes"
docker: " --docker --docker_it=no --docker_cm_repo=mlcommons@mlperf-automations --docker_cm_repo_branch=dev --docker_dt=yes"
# windows docker image is not supported in CM yet
- os: macos-latest
python-version: "3.8"
- os: macos-13
python-version: "3.8"
- os: macos-latest
docker: " --docker --docker_it=no --docker_cm_repo=gateoverflow@cm4mlops --docker_dt=yes"
docker: " --docker --docker_it=no --docker_cm_repo=mlcommons@mlperf-automations --docker_cm_repo_branch=dev --docker_dt=yes"
- os: macos-13
docker: " --docker --docker_it=no --docker_cm_repo=gateoverflow@cm4mlops --docker_dt=yes"
docker: " --docker --docker_it=no --docker_cm_repo=mlcommons@mlperf-automations --docker_cm_repo_branch=dev --docker_dt=yes"

steps:
- uses: actions/checkout@v3
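The matrix change swaps the Docker automation source from the personal `gateoverflow@cm4mlops` fork to `mlcommons@mlperf-automations`, pinned to the dev branch. In a local reproduction, the same flags would be appended to a CM run along these lines (the script tags are illustrative; the `--docker_*` flags are taken verbatim from the matrix above):

cm run script --tags=run,mlperf,inference \
  --docker --docker_it=no \
  --docker_cm_repo=mlcommons@mlperf-automations \
  --docker_cm_repo_branch=dev --docker_dt=yes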
10 changes: 5 additions & 5 deletions README.md
@@ -5,12 +5,12 @@ This repository contains the automations and scripts used to run MLPerf benchmar

## Collective Mind (CM)

**Collective Mind (CM)** is a Python package with a CLI and API designed for creating and managing automations. Two key automations developed using CM are **Script** and **Cache**, which streamline machine learning (ML) workflows, including managing Docker runs. Both Script and Cache automations are extended as part of this repository.

The CM scripts housed in this repository consist of hundreds of modular Python-wrapped scripts accompanied by `yaml` metadata, enabling the creation of robust and flexible ML workflows.

- **CM Scripts Documentation**: [https://docs.mlcommons.org/cm4mlops/](https://docs.mlcommons.org/cm4mlops/)
**CM (Collective Mind)** is a Python package with a CLI and API designed to create and manage automations. Two key automations developed using CM are **Script** and **Cache**, which streamline ML workflows, including managing Docker runs.
The CM Python package was developed by Grigori Fursin. The Script and Cache automations are part of the cm4mlops repository, created by Grigori Fursin and Arjun Suresh and sponsored by OctoML, cKnowledge, cTuning and MLCommons.
The CM scripts, also housed in the cm4mlops repository, are created and maintained by Arjun Suresh and Anandhu Sooraj, and Grigori Fursin with the help of the MLCommons community.

**CM CLI:** https://docs.mlcommons.org/ck/specs/cm-cli/
**Documentation site for MLPerf Inference:** https://docs.mlcommons.org/inference/
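For readers new to CM, a minimal usage sketch (the install name and example tags follow the CM documentation of this era; later releases may differ):

python3 -m pip install cm4mlops          # installs the `cm` CLI and registers the script repo
cm run script --tags=detect,os -j        # run a CM script by its tags, JSON output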

## License

2 changes: 1 addition & 1 deletion VERSION
@@ -1 +1 @@
0.6.4
0.6.11
2 changes: 1 addition & 1 deletion git_commit_hash.txt
@@ -1 +1 @@
8768d67e1f9187005bdb3c7f325e42998dc7fd8a
f5e04069c8d7395be34f94fa8a94edc6c317b58e
4 changes: 2 additions & 2 deletions mkdocs.yml
@@ -1,5 +1,5 @@
site_name: CM Script Automation Documentation
repo_url: https://github.com/mlcommons/cm4mlops
site_name: MLPerf Automation Documentation
repo_url: https://github.com/mlcommons/mlperf-automations
theme:
name: material
logo: img/logo_v2.svg
46 changes: 33 additions & 13 deletions script/app-mlperf-inference-mlcommons-python/_cm.yaml
@@ -482,13 +482,12 @@ deps:
## RGAT
- tags: get,ml-model,rgat
names:
- ml-model
- rgat-model
enable_if_env:
CM_MODEL:
- rgat
skip_if_env:
RGAT_CHECKPOINT_PATH:
CM_ML_MODEL_RGAT_CHECKPOINT_PATH:
- 'on'

########################################################################
@@ -620,6 +619,9 @@ deps:
enable_if_env:
CM_MODEL:
- rgat
skip_if_env:
CM_DATASET_IGBH_PATH:
- "on"

########################################################################
# Install MLPerf inference dependencies
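The `enable_if_env` / `skip_if_env` pair above reads: pull the IGBH dataset dependency only when the model is R-GAT, and skip it when `CM_DATASET_IGBH_PATH` is already supplied. A rough shell rendering of the gating (CM evaluates this internally; sketch only):

if [ "${CM_MODEL}" = "rgat" ] && [ -z "${CM_DATASET_IGBH_PATH}" ]; then
  echo "resolving get,dataset,igbh"   # dependency fires
fi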
@@ -1224,27 +1226,45 @@ variations:
group: models
env:
CM_MODEL: rgat
adr:
pytorch:
version: 2.1.0
deps:
- tags: get,generic-python-lib,_package.colorama
- tags: get,generic-python-lib,_package.tqdm
- tags: get,generic-python-lib,_package.requests
- tags: get,generic-python-lib,_package.torchdata
- tags: get,generic-python-lib,_package.torch-geometric
- tags: get,generic-python-lib,_package.torch-scatter
- tags: get,generic-python-lib,_package.torch-sparse
version: 0.7.0
- tags: get,generic-python-lib,_package.torchvision
version: 0.16.0
- tags: get,generic-python-lib,_package.pybind11
- tags: get,generic-python-lib,_package.PyYAML
- tags: get,generic-python-lib,_package.numpy
version: 1.26.4
- tags: get,generic-python-lib,_package.pydantic
- tags: get,generic-python-lib,_package.igb,_url.git+https://github.com/IllinoisGraphBenchmark/IGB-Datasets.git
- tags: get,generic-python-lib,_package.dgl,_find_links_url.https://data.dgl.ai/wheels/torch-2.1/repo.html
enable_if_env:
CM_MLPERF_DEVICE:
- cpu

rgat,cuda:
deps:
- tags: get,generic-python-lib,_package.dgl,_find_links_url.https://data.dgl.ai/wheels/torch-2.1/cu121/repo.html
enable_if_env:
CM_MLPERF_DEVICE:
- gpu

- tags: get,generic-python-lib,_package.torch-scatter
- tags: get,generic-python-lib,_package.torch-sparse
- tags: get,generic-python-lib,_package.torch-geometric
env:
CM_GENERIC_PYTHON_PIP_EXTRA_FIND_LINKS_URL: "https://data.pyg.org/whl/torch-<<<CM_TORCH_VERSION>>>.html"

rgat,cpu:
deps:
- tags: get,generic-python-lib,_package.torch-geometric
env:
CM_GENERIC_PYTHON_PIP_EXTRA_FIND_LINKS_URL: "https://data.pyg.org/whl/torch-<<<CM_TORCH_VERSION>>>+cpu.html"
- tags: get,generic-python-lib,_package.torch-scatter
env:
CM_GENERIC_PYTHON_PIP_EXTRA_FIND_LINKS_URL: "https://data.pyg.org/whl/torch-<<<CM_TORCH_VERSION>>>+cpu.html"
- tags: get,generic-python-lib,_package.torch-sparse
env:
CM_GENERIC_PYTHON_PIP_EXTRA_FIND_LINKS_URL: "https://data.pyg.org/whl/torch-<<<CM_TORCH_VERSION>>>+cpu.html"
- tags: get,generic-python-lib,_package.dgl,_find_links_url.https://data.dgl.ai/wheels/torch-2.1/repo.html
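Both the `_find_links_url.` tag suffix and the `CM_GENERIC_PYTHON_PIP_EXTRA_FIND_LINKS_URL` template feed pip a `--find-links` index once `<<<CM_TORCH_VERSION>>>` is substituted. For torch 2.1.0 on CPU, the resolved installs look roughly like this (CM substitutes the real torch version at run time):

python3 -m pip install torch-scatter torch-sparse torch-geometric \
  -f "https://data.pyg.org/whl/torch-2.1.0+cpu.html"
python3 -m pip install dgl -f "https://data.dgl.ai/wheels/torch-2.1/repo.html"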

# Target devices
cpu:
17 changes: 10 additions & 7 deletions script/app-mlperf-inference-mlcommons-python/customize.py
@@ -115,10 +115,12 @@ def preprocess(i):
scenario_extra_options = ''

NUM_THREADS = env['CM_NUM_THREADS']
if int(NUM_THREADS) > 2 and env['CM_MLPERF_DEVICE'] == "gpu":
if int(
NUM_THREADS) > 2 and env['CM_MLPERF_DEVICE'] == "gpu" and env['CM_MODEL'] != "rgat":
NUM_THREADS = "2" # Don't use more than 2 threads when run on GPU

if env['CM_MODEL'] in ['resnet50', 'retinanet', 'stable-diffusion-xl']:
if env['CM_MODEL'] in ['resnet50', 'retinanet',
'stable-diffusion-xl', 'rgat']:
scenario_extra_options += " --threads " + NUM_THREADS

ml_model_name = env['CM_MODEL']
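The effect of the threads change, rendered as shell (sketch only): the 2-thread GPU cap now exempts R-GAT, which also joins the models that receive `--threads` explicitly.

if [ "${NUM_THREADS}" -gt 2 ] && [ "${CM_MLPERF_DEVICE}" = "gpu" ] \
   && [ "${CM_MODEL}" != "rgat" ]; then
  NUM_THREADS=2   # cap kept for the other GPU workloads
fi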
@@ -485,15 +487,16 @@ def get_run_cmd_reference(
# have to add the condition for running in debug mode or real run mode
cmd = env['CM_PYTHON_BIN_WITH_PATH'] + " main.py " \
" --scenario " + env['CM_MLPERF_LOADGEN_SCENARIO'] + \
" --dataset-path " + env['CM_IGBH_DATASET_PATH'] + \
" --device " + device.replace("cuda", "cuda:0") + \
" --dataset-path " + env['CM_DATASET_IGBH_PATH'] + \
" --device " + device.replace("cuda", "gpu") + \
env['CM_MLPERF_LOADGEN_EXTRA_OPTIONS'] + \
scenario_extra_options + mode_extra_options + \
" --output " + env['CM_MLPERF_OUTPUT_DIR'] + \
' --dtype ' + dtype_rgat + \
" --model-path " + env['RGAT_CHECKPOINT_PATH'] + \
" --mlperf_conf " + \
os.path.join(env['CM_MLPERF_INFERENCE_SOURCE'], "mlperf.conf")
" --model-path " + env['CM_ML_MODEL_RGAT_CHECKPOINT_PATH']

if env.get('CM_ACTIVATE_RGAT_IN_MEMORY', '') == "yes":
cmd += " --in-memory "

if env.get('CM_NETWORK_LOADGEN', '') in ["lon", "sut"]:
cmd = cmd + " " + "--network " + env['CM_NETWORK_LOADGEN']
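Assembled, the reference R-GAT invocation now looks roughly like this (paths and scenario are placeholders; note that `cuda` is rewritten to `gpu` and the `--mlperf_conf` flag is dropped):

python3 main.py --scenario Offline \
  --dataset-path /data/igbh --device gpu \
  --output ./results --dtype fp32 \
  --model-path /models/RGAT.pt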
35 changes: 35 additions & 0 deletions script/app-mlperf-inference/_cm.yaml
@@ -767,6 +767,20 @@ variations:
env:
CM_MODEL:
rgat
posthook_deps:
- enable_if_env:
CM_MLPERF_LOADGEN_MODE:
- accuracy
- all
CM_MLPERF_ACCURACY_RESULTS_DIR:
- 'on'
skip_if_env:
CM_MLPERF_IMPLEMENTATION:
- nvidia
names:
- mlperf-accuracy-script
- 3d-unet-accuracy-script
tags: run,accuracy,mlperf,_igbh
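By tag resolution, this posthook amounts to invoking the IGBH accuracy checker after accuracy-mode runs, along the lines of (tags from the entry above; extra inputs omitted):

cm run script --tags=run,accuracy,mlperf,_igbh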

sdxl:
group:
@@ -1645,6 +1659,25 @@ variations:
CM_ENV_NVMITTEN_DOCKER_WHEEL_PATH: '/opt/nvmitten-0.1.3b0-cp38-cp38-linux_x86_64.whl'
CM_MLPERF_INFERENCE_VERSION: '4.1'

r5.0-dev_default:
group:
reproducibility
add_deps_recursive:
nvidia-inference-common-code:
version: r4.1
tags: _mlcommons
nvidia-inference-server:
version: r4.1
tags: _mlcommons
intel-harness:
tags: _v4.1
default_env:
CM_SKIP_SYS_UTILS: 'yes'
CM_REGENERATE_MEASURE_FILES: 'yes'
env:
CM_ENV_NVMITTEN_DOCKER_WHEEL_PATH: '/opt/nvmitten-0.1.3b0-cp38-cp38-linux_x86_64.whl'


invalid_variation_combinations:
-
- retinanet
@@ -1768,6 +1801,8 @@ docker:
- "${{ CM_NVIDIA_LLAMA_DATASET_FILE_PATH }}:${{ CM_NVIDIA_LLAMA_DATASET_FILE_PATH }}"
- "${{ SDXL_CHECKPOINT_PATH }}:${{ SDXL_CHECKPOINT_PATH }}"
- "${{ CM_DATASET_KITS19_PREPROCESSED_PATH }}:${{ CM_DATASET_KITS19_PREPROCESSED_PATH }}"
- "${{ CM_DATASET_IGBH_PATH }}:${{ CM_DATASET_IGBH_PATH }}"
- "${{ CM_ML_MODEL_RGAT_CHECKPOINT_PATH }}:${{ CM_ML_MODEL_RGAT_CHECKPOINT_PATH }}"
skip_run_cmd: 'no'
shm_size: '32gb'
interactive: True
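The two added mounts expose the IGBH dataset and the R-GAT checkpoint inside containers; each `${{ VAR }}:${{ VAR }}` pair becomes a same-path bind mount, conceptually (image tag illustrative):

docker run \
  -v "${CM_DATASET_IGBH_PATH}:${CM_DATASET_IGBH_PATH}" \
  -v "${CM_ML_MODEL_RGAT_CHECKPOINT_PATH}:${CM_ML_MODEL_RGAT_CHECKPOINT_PATH}" \
  mlperf-inference:dev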
1 change: 1 addition & 0 deletions script/get-cudnn/_cm.yaml
@@ -19,6 +19,7 @@ default_env:

deps:
- tags: detect,os
- tags: detect,sudo
- names:
- cuda
skip_if_env:
14 changes: 9 additions & 5 deletions script/get-dataset-mlperf-inference-igbh/_cm.yaml
@@ -11,7 +11,8 @@ tags:
- inference
uid: 824e61316c074253
new_env_keys:
- CM_IGBH_DATASET_PATH
- CM_DATASET_IGBH_PATH
- CM_DATASET_IGBH_SIZE
input_mapping:
out_path: CM_IGBH_DATASET_OUT_PATH
deps:
@@ -21,6 +22,9 @@ deps:
- tags: get,python
names:
- get-python
- tags: get,generic-python-lib,_package.igb,_url.git+https://github.com/anandhu-eng/IGB-Datasets.git
- tags: get,generic-python-lib,_package.colorama
- tags: get,generic-python-lib,_package.tqdm
prehook_deps:
#paper
- env:
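Among the added dependencies, the `_package.igb,_url.git+…` tag maps to a pip VCS install, roughly:

python3 -m pip install "git+https://github.com/anandhu-eng/IGB-Datasets.git"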
@@ -359,13 +363,13 @@ variations:
default: true
group: dataset-type
env:
CM_IGBH_DATASET_TYPE: debug
CM_IGBH_DATASET_SIZE: tiny
CM_DATASET_IGBH_TYPE: debug
CM_DATASET_IGBH_SIZE: tiny
full:
group: dataset-type
env:
CM_IGBH_DATASET_TYPE: full
CM_IGBH_DATASET_SIZE: full
CM_DATASET_IGBH_TYPE: debug
CM_DATASET_IGBH_SIZE: tiny
glt:
env:
CM_IGBH_GRAPH_COMPRESS: yes
10 changes: 5 additions & 5 deletions script/get-dataset-mlperf-inference-igbh/customize.py
@@ -27,18 +27,18 @@ def preprocess(i):
x_sep = " && "

# download the model
if env['CM_IGBH_DATASET_TYPE'] == "debug":
if env['CM_DATASET_IGBH_TYPE'] == "debug":
run_cmd += x_sep + env['CM_PYTHON_BIN_WITH_PATH'] + \
f" tools/download_igbh_test.py --target-path {download_loc} "

# split seeds
run_cmd += x_sep + \
f"{env['CM_PYTHON_BIN_WITH_PATH']} tools/split_seeds.py --path {download_loc} --dataset_size {env['CM_IGBH_DATASET_SIZE']}"
f"{env['CM_PYTHON_BIN_WITH_PATH']} tools/split_seeds.py --path {download_loc} --dataset_size {env['CM_DATASET_IGBH_SIZE']}"

# compress graph(for glt implementation)
if env.get('CM_IGBH_GRAPH_COMPRESS', '') == "yes":
run_cmd += x_sep + \
f"{env['CM_PYTHON_BIN_WITH_PATH']} tools/compress_graph.py --path {download_loc} --dataset_size {env['CM_IGBH_DATASET_SIZE']} --layout {env['CM_IGBH_GRAPH_COMPRESS_LAYOUT']}"
f"{env['CM_PYTHON_BIN_WITH_PATH']} tools/compress_graph.py --path {download_loc} --dataset_size {env['CM_DATASET_IGBH_SIZE']} --layout {env['CM_IGBH_GRAPH_COMPRESS_LAYOUT']}"

env['CM_RUN_CMD'] = run_cmd
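For the default debug/tiny configuration, the commands appended to `run_cmd` expand to roughly the following (download location and any preceding commands are placeholders):

python3 tools/download_igbh_test.py --target-path ./igbh/
python3 tools/split_seeds.py --path ./igbh/ --dataset_size tiny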

@@ -49,10 +49,10 @@ def postprocess(i):

env = i['env']

env['CM_IGBH_DATASET_PATH'] = env.get(
env['CM_DATASET_IGBH_PATH'] = env.get(
'CM_IGBH_DATASET_OUT_PATH', os.getcwd())

print(
f"Path to the IGBH dataset: {os.path.join(env['CM_IGBH_DATASET_PATH'], env['CM_IGBH_DATASET_SIZE'])}")
f"Path to the IGBH dataset: {os.path.join(env['CM_DATASET_IGBH_PATH'], env['CM_DATASET_IGBH_SIZE'])}")

return {'return': 0}
2 changes: 1 addition & 1 deletion script/get-ml-model-rgat/_cm.yaml
@@ -12,7 +12,7 @@ input_mapping:
to: CM_DOWNLOAD_PATH
new_env_keys:
- CM_ML_MODEL_*
- RGAT_CHECKPOINT_PATH
- CM_ML_MODEL_RGAT_CHECKPOINT_PATH
prehook_deps:
- enable_if_env:
CM_DOWNLOAD_TOOL:
8 changes: 4 additions & 4 deletions script/get-ml-model-rgat/customize.py
@@ -19,12 +19,12 @@ def postprocess(i):

env = i['env']

if env.get('RGAT_CHECKPOINT_PATH', '') == '':
env['RGAT_CHECKPOINT_PATH'] = os.path.join(
if env.get('CM_ML_MODEL_RGAT_CHECKPOINT_PATH', '') == '':
env['CM_ML_MODEL_RGAT_CHECKPOINT_PATH'] = os.path.join(
env['CM_ML_MODEL_PATH'], "RGAT.pt")
elif env.get('CM_ML_MODEL_PATH', '') == '':
env['CM_ML_MODEL_PATH'] = env['RGAT_CHECKPOINT_PATH']
env['CM_ML_MODEL_PATH'] = env['CM_ML_MODEL_RGAT_CHECKPOINT_PATH']

env['CM_GET_DEPENDENT_CACHED_PATH'] = env['RGAT_CHECKPOINT_PATH']
env['CM_GET_DEPENDENT_CACHED_PATH'] = env['CM_ML_MODEL_RGAT_CHECKPOINT_PATH']

return {'return': 0}
7 changes: 3 additions & 4 deletions script/get-mlperf-inference-loadgen/run.sh
@@ -25,14 +25,14 @@ cmake \
-DCMAKE_INSTALL_PREFIX="${INSTALL_DIR}" \
"${CM_MLPERF_INFERENCE_SOURCE}/loadgen" \
-DPYTHON_EXECUTABLE:FILEPATH="${CM_PYTHON_BIN_WITH_PATH}" -B .
if [ ${?} -ne 0 ]; then exit $?; fi
test $? -eq 0 || exit $?

echo "******************************************************"
CM_MAKE_CORES=${CM_MAKE_CORES:-${CM_HOST_CPU_TOTAL_CORES}}
CM_MAKE_CORES=${CM_MAKE_CORES:-2}

cmake --build . --target install -j "${CM_MAKE_CORES}"
if [ ${?} -ne 0 ]; then exit $?; fi
test $? -eq 0 || exit $?

# Clean build directory (too large)
cd "${CUR_DIR}"
Expand All @@ -43,8 +43,7 @@ fi

cd "${CM_MLPERF_INFERENCE_SOURCE}/loadgen"
${CM_PYTHON_BIN_WITH_PATH} -m pip install . --target="${MLPERF_INFERENCE_PYTHON_SITE_BASE}"

if [ ${?} -ne 0 ]; then exit $?; fi
test $? -eq 0 || exit $?
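One caveat: in both the old `if [ ${?} -ne 0 ]` form and the new `test $? -eq 0 || exit $?` form, `$?` at the `exit` has already been overwritten by the status of the test itself, so the failing command's exit code is not propagated (the old form even exits 0; the new form exits 1). A pattern that preserves it, offered as a suggestion rather than part of the commit:

rc=$?                           # capture before anything else overwrites it
[ "$rc" -eq 0 ] || exit "$rc"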

# Clean the built wheel
#find . -name 'mlcommons_loadgen*.whl' | xargs rm