Skip to content
This repository has been archived by the owner on Jan 9, 2024. It is now read-only.

Latest commit

 

History

History
132 lines (93 loc) · 4.93 KB

README.md

File metadata and controls

132 lines (93 loc) · 4.93 KB

Deprecation notice

This repo is no longer in use or maintained, the benchmark code now lives in the MLCommons inference repo here.

Stable Diffusion Inference


Welcome to the experimental stable diffusion inference repository. This repository aims to test and evaluate the model as a new benchmark candidate for MLPerf inference .

References

Getting Started

1. Setting Up the Environment

To set up the environment, build and launch the container using the following commands:

docker build . -t sd_mlperf_inference
docker run --rm -it --gpus=all -v {PWD}:/workspace sd_mlperf_inference bash

Note : The subsequent commands are assumed to be executed within this container.

2. Dataset Overview

This repository leverages the coco-2014 validation set for image generation and computation of FID and CLIP scores. COCO (Common Objects in Context) is a diverse dataset instrumental for object detection, segmentation, and captioning tasks. It boasts a substantial validation set with over 40,000 images encompassing more than 200,000 labels.

For benchmarking purposes, we utilize a random subset of {TBD} images and their associated labels, determined by a preset seed of {TBD}. As the focus is on image generation from labels and subsequent score calculation, downloading the entire COCO dataset is unnecessary. The required files for the benchmark, already part of the repository:

  • captions.tsv: Contains processed coco2014 validation annotations with 40,504 prompts and their respective IDs. Essential for image generation and CLIP scoring.
  • captions_5k.tsv: Contains the benchmark annotations. Essential for image generation and CLIP scoring.
  • val2014.npz: Consists of precomputed statistics of the coco2014 validation set. Utilized for FID scoring.

For details on file generation, refer to Appendix A .

3. Image Generation

Execute the main.py script to generate images:

python main.py \
    --model-id xl \         # xl for SD-XL, xlr for SD-XL + Refiner
    --guidance 8.0 \
    --precision fp16 \     # fp16, bf16 and fp32
    --scheduler euler \
    --steps 20 \
    --latent-path latents.pt

For additional execution options:

python main.py --help

4. Compute FID Score

python fid/fid_score.py \
    --batch-size 1 \                 # batch size for the inception network. keep it 1.
    --subset-size 35000 \            # validation subset size, if you want to score the full dataset don't set the argument
    --shuffle-seed 2023 \            # the seed used for random the random subset selection
    ./val2014.npz \                  # ground truth (coco 2014 validation) statistics
    ./output                         # folder with the generated images

For more options:

python fid/fid_score.py --help

5. Compute CLIP Score

python clip/clip_score.py \
    --subset-size 35000 \          # validation subset size, if you want to score the full dataset don't set the argument
    --shuffle-seed 2023 \          # the seed used for random the random subset selection
    --tsv-file captions_5k.tsv \   # captions file
    --image-folder ./output        # Folder with the generated images
    --device cuda                  # Device in which CLIP model is run (cpu, cuda)

For more options:

python clip/clip_score.py --help

Appendix A: Generating Dataset Files

To create the captions.tsv, captions_5k.tsv and val2014.npz files:

  1. Download the coco2014 validation set:
scripts/coco-2014-validation-download.sh
  1. Process the downloaded annotations (provided in JSON format):
python process-coco-annotations.py \
    --input-captions-file {PATH_TO_COCO_ANNOTATIONS_FILE} \                 # Input annotations file
    --output-tsv-file captions.tsv \                                        # Output annotations
    --allow-duplicate-images                                                # Pick one prompt per image
  1. Select a pseduo-random captions subset ():
python subset_generator.py \
    --seed 2023 \                               # Random number generator seed
    --subset-size 5000 \                        # Subset size
    --input-captions-file captions.tsv \        # Input annotations file
    --output-captions-file captions_5k.tsv      # Output annotations
  1. Generate ground truth statistics:
python fid/fid_score.py \
    --batch-size 1 \                      # inception network batch size
    --save-stats {COCO_2014_IMAGES} \     # Input folder with coco images
    val2014.npz                           # Output file