diff --git a/open/MLCommons/measurements/ec05e49baa6c-reference-gpu-pytorch-v2.5.1-scc24-base_cu124/README.md b/open/MLCommons/measurements/ec05e49baa6c-reference-gpu-pytorch-v2.5.1-scc24-base_cu124/README.md
new file mode 100644
index 0000000..f017152
--- /dev/null
+++ b/open/MLCommons/measurements/ec05e49baa6c-reference-gpu-pytorch-v2.5.1-scc24-base_cu124/README.md
@@ -0,0 +1,3 @@
+| Model               | Scenario   | Accuracy             | Throughput   | Latency (in ms)   |
+|---------------------|------------|----------------------|--------------|-------------------|
+| stable-diffusion-xl | offline    | (16.3689, 237.82579) | 0.384        | -                 |
\ No newline at end of file
diff --git a/open/MLCommons/measurements/ec05e49baa6c-reference-gpu-pytorch-v2.5.1-scc24-base_cu124/stable-diffusion-xl/offline/README.md b/open/MLCommons/measurements/ec05e49baa6c-reference-gpu-pytorch-v2.5.1-scc24-base_cu124/stable-diffusion-xl/offline/README.md
new file mode 100644
index 0000000..212907c
--- /dev/null
+++ b/open/MLCommons/measurements/ec05e49baa6c-reference-gpu-pytorch-v2.5.1-scc24-base_cu124/stable-diffusion-xl/offline/README.md
@@ -0,0 +1,102 @@
+This experiment is generated using the [MLCommons Collective Mind automation framework (CM)](https://github.com/mlcommons/cm4mlops).
+
+*Check [CM MLPerf docs](https://docs.mlcommons.org/inference) for more details.*
+
+## Host platform
+
+* OS version: Linux-6.2.0-39-generic-x86_64-with-glibc2.35
+* CPU version: x86_64
+* Python version: 3.10.12 (main, Sep 11 2024, 15:47:36) [GCC 11.4.0]
+* MLCommons CM version: 3.3.4
+
+## CM Run Command
+
+See [CM installation guide](https://docs.mlcommons.org/inference/install/).
+
+```bash
+pip install -U cmind
+
+cm rm cache -f
+
+cm pull repo gateoverflow@cm4mlops --checkout=479d496d2b7f243fbf6791f1c154b63129bc9c15
+
+cm run script \
+   --tags=app,mlperf,inference,generic,_reference,_sdxl,_pytorch,_cuda,_test,_r4.1-dev_default,_float16,_offline \
+   --quiet=true \
+   --env.CM_MLPERF_MODEL_SDXL_DOWNLOAD_TO_HOST=yes \
+   --env.CM_QUIET=yes \
+   --env.CM_MLPERF_IMPLEMENTATION=reference \
+   --env.CM_MLPERF_MODEL=sdxl \
+   --env.CM_MLPERF_RUN_STYLE=test \
+   --env.CM_MLPERF_SKIP_SUBMISSION_GENERATION=False \
+   --env.CM_DOCKER_PRIVILEGED_MODE=True \
+   --env.CM_MLPERF_BACKEND=pytorch \
+   --env.CM_MLPERF_SUBMISSION_SYSTEM_TYPE=datacenter \
+   --env.CM_MLPERF_CLEAN_ALL=True \
+   --env.CM_MLPERF_DEVICE=cuda \
+   --env.CM_MLPERF_USE_DOCKER=True \
+   --env.CM_MLPERF_MODEL_PRECISION=float16 \
+   --env.OUTPUT_BASE_DIR=/home/arjun/scc_gh_action_results \
+   --env.CM_MLPERF_LOADGEN_SCENARIO=Offline \
+   --env.CM_MLPERF_INFERENCE_SUBMISSION_DIR=/home/arjun/scc_gh_action_submissions \
+   --env.CM_MLPERF_INFERENCE_VERSION=4.1-dev \
+   --env.CM_RUN_MLPERF_INFERENCE_APP_DEFAULTS=r4.1-dev_default \
+   --env.CM_MLPERF_SUBMISSION_DIVISION=open \
+   --env.CM_RUN_MLPERF_SUBMISSION_PREPROCESSOR=False \
+   --env.CM_MLPERF_SUBMISSION_GENERATION_STYLE=short \
+   --env.CM_MLPERF_SUT_NAME_RUN_CONFIG_SUFFIX4=scc24-base \
+   --env.CM_DOCKER_IMAGE_NAME=scc24-reference \
+   --env.CM_MLPERF_INFERENCE_MIN_QUERY_COUNT=50 \
+   --env.CM_MLPERF_LOADGEN_ALL_MODES=yes \
+   --env.CM_MLPERF_INFERENCE_SOURCE_VERSION=4.1.23 \
+   --env.CM_MLPERF_LAST_RELEASE=v4.1 \
+   --env.CM_TMP_CURRENT_PATH=/home/arjun/actions-runner/_work/cm4mlops/cm4mlops \
+   --env.CM_TMP_PIP_VERSION_STRING= \
+   --env.CM_MODEL=sdxl \
+   --env.CM_MLPERF_LOADGEN_COMPLIANCE=no \
+   --env.CM_MLPERF_CLEAN_SUBMISSION_DIR=yes \
+   --env.CM_RERUN=yes \
+   --env.CM_MLPERF_LOADGEN_EXTRA_OPTIONS= \
+   --env.CM_MLPERF_LOADGEN_MODE=performance \
+   --env.CM_MLPERF_LOADGEN_SCENARIOS,=Offline \
+   --env.CM_MLPERF_LOADGEN_MODES,=performance,accuracy \
+   --env.CM_OUTPUT_FOLDER_NAME=test_results \
+   --env.CM_DOCKER_REUSE_EXISTING_CONTAINER=no \
+   --env.CM_DOCKER_DETACHED_MODE=yes \
+   --add_deps_recursive.get-mlperf-inference-results-dir.tags=_version.r4_1-dev \
+   --add_deps_recursive.get-mlperf-inference-submission-dir.tags=_version.r4_1-dev \
+   --add_deps_recursive.mlperf-inference-nvidia-scratch-space.tags=_version.r4_1-dev \
+   --add_deps_recursive.submission-checker.tags=_short-run \
+   --add_deps_recursive.coco2014-preprocessed.tags=_size.50,_with-sample-ids \
+   --add_deps_recursive.coco2014-dataset.tags=_size.50,_with-sample-ids \
+   --add_deps_recursive.nvidia-preprocess-data.extra_cache_tags=scc24-base \
+   --v=False \
+   --print_env=False \
+   --print_deps=False \
+   --dump_version_info=True \
+   --env.OUTPUT_BASE_DIR=/cm-mount/home/arjun/scc_gh_action_results \
+   --env.CM_MLPERF_INFERENCE_SUBMISSION_DIR=/cm-mount/home/arjun/scc_gh_action_submissions \
+   --env.SDXL_CHECKPOINT_PATH=/home/cmuser/CM/repos/local/cache/6be1f30ecbde4c4e/stable_diffusion_fp16
+```
+*Note that if you want to use the [latest automation recipes](https://docs.mlcommons.org/inference) for MLPerf (CM scripts),
+ you should simply reload gateoverflow@cm4mlops without checkout and clean CM cache as follows:*
+
+```bash
+cm rm repo gateoverflow@cm4mlops
+cm pull repo gateoverflow@cm4mlops
+cm rm cache -f
+
+```
+
+## Results
+
+Platform: ec05e49baa6c-reference-gpu-pytorch-v2.5.1-scc24-base_cu124
+
+Model Precision: fp32
+
+### Accuracy Results
+`CLIP_SCORE`: `16.3689`, Required accuracy for closed division `>= 31.68632` and `<= 31.81332`
+`FID_SCORE`: `237.82579`, Required accuracy for closed division `>= 23.01086` and `<= 23.95008`
+
+### Performance Results
+`Samples per second`: `0.383577`
diff --git a/open/MLCommons/measurements/ec05e49baa6c-reference-gpu-pytorch-v2.5.1-scc24-base_cu124/stable-diffusion-xl/offline/accuracy_console.out b/open/MLCommons/measurements/ec05e49baa6c-reference-gpu-pytorch-v2.5.1-scc24-base_cu124/stable-diffusion-xl/offline/accuracy_console.out
new file mode 100644
index 0000000..c820b03
--- /dev/null
+++ b/open/MLCommons/measurements/ec05e49baa6c-reference-gpu-pytorch-v2.5.1-scc24-base_cu124/stable-diffusion-xl/offline/accuracy_console.out
@@ -0,0 +1,68 @@
+INFO:main:Namespace(dataset='coco-1024', dataset_path='/home/cmuser/CM/repos/local/cache/0fd744f4046f4dae/install', profile='stable-diffusion-xl-pytorch', scenario='Offline', max_batchsize=1, threads=1, accuracy=True, find_peak_performance=False, backend='pytorch', model_name='stable-diffusion-xl', output='/cm-mount/home/arjun/scc_gh_action_results/test_results/ec05e49baa6c-reference-gpu-pytorch-v2.5.1-scc24-base_cu124/stable-diffusion-xl/offline/accuracy', qps=None, model_path='/home/cmuser/CM/repos/local/cache/6be1f30ecbde4c4e/stable_diffusion_fp16', dtype='fp16', device='cuda', latent_framework='torch', user_conf='/home/cmuser/CM/repos/gateoverflow@cm4mlops/script/generate-mlperf-inference-user-conf/tmp/145d62e32dd94f2c9131c002e17821e5.conf', audit_conf='audit.config', ids_path='/home/cmuser/CM/repos/local/cache/0fd744f4046f4dae/install/sample_ids.txt', time=None, count=10, debug=False, performance_sample_count=5000, max_latency=None, samples_per_query=8)
+Keyword arguments {'safety_checker': None} are not expected by StableDiffusionXLPipeline and will be ignored.
+ Loading pipeline components...: 0%| | 0/7 [00:00