[NeurIPS 2023 Spotlight] Official implementation of the paper "Real-World Image Variation by Aligning Diffusion Inversion Chain". [PDF] [arXiv] [Project Page]
- [20231028] Code release for the image variations and text-to-image
- [20231030] Code release for ControlNet inference, image editing
- [20231031] Code release for other applications (such as inpainting) and the user manual
- [202311xx] Code release for SDXL, and other possible applications
We provide examples for five applications: variations, T2I, editing, inpainting, and ControlNet.
Please open an issue or PR if you run into problems setting up the environment.
conda create -n rival python=3.9.16
conda activate rival
pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
pip install -r requirements.txt
conda install xformers -c xformers
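After installation, a quick check like the following can confirm that the packages are visible to the active environment. This is a minimal sketch, not part of the repository: the helper name `check_packages` is hypothetical, the module names are assumed from the install commands above, and the check only tests whether each module can be located, not whether CUDA works.

```python
import importlib.util


def check_packages(names):
    """Return {module name: bool} for whether each module can be located."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    # Module names assumed from the install commands above.
    wanted = ["torch", "torchvision", "torchaudio", "xformers"]
    for name, found in check_packages(wanted).items():
        print(f"{name}: {'found' if found else 'MISSING'}")
```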
All applications have a config file for inference. The following shows a brief explanation of some key parameters.
{
"self_attn":
{
"atten_frames": 2,
"t_align": 600 # [0-1000], smaller means closer to the original image (semantically).
},
"inference":
{
"invert_step": 50,
"ddim_step": 50,
"cfg": 7,
"is_null_prompt": true, # whether to use the empty prompt "" in inversion.
"t_early": 600 # [0-1000], smaller means closer to the original image (low-level color distribution).
}
}
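To try different alignment strengths without editing the file by hand, the config can be copied and modified programmatically. The following is a minimal sketch, assuming the config on disk is plain JSON with the layout shown above (the `#` comments are explanatory only and would not parse as JSON). The helper name `with_timesteps` is hypothetical.

```python
import copy
import json


def with_timesteps(config, t_align=None, t_early=None):
    """Return a copy of `config` with t_align / t_early overridden.

    Both values must lie in [0, 1000], the range documented above.
    """
    new_config = copy.deepcopy(config)
    for section, key, value in (
        ("self_attn", "t_align", t_align),
        ("inference", "t_early", t_early),
    ):
        if value is None:
            continue  # leave this parameter unchanged
        if not 0 <= value <= 1000:
            raise ValueError(f"{key} must be in [0, 1000], got {value}")
        new_config[section][key] = value
    return new_config


# Same keys and values as the example config above, without comments.
base = {
    "self_attn": {"atten_frames": 2, "t_align": 600},
    "inference": {
        "invert_step": 50,
        "ddim_step": 50,
        "cfg": 7,
        "is_null_prompt": True,
        "t_early": 600,
    },
}

# Smaller values stay closer to the reference image, per the comments above.
closer = with_timesteps(base, t_align=400, t_early=400)
print(json.dumps(closer, indent=2))
```

The resulting dictionary can be written to a new file with `json.dump` and passed to a test script through --inf_config.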
In the test Python file (e.g., rival/test_variation_sdv1.py):
- --inf_config: Inference config file. default="configs/rival_variation.json"
- --img_config: Data config file. default="assets/images/configs_variation.json"
- --inner_round: Number of images to generate per reference. default=1
- --exp_folder: Output folder. default="out/variation_exps"
- --pretrained_model_path: SD model path. default="runwayml/stable-diffusion-v1-5"
- --is_half: Whether to use fp16. default=False
- --is_editing: If set to True, the inverted latent is not permuted. default=False
- --editing_early_steps: For timesteps t > this value, self-attention runs as in normal inference. default=1000
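For reference, the documented flags and defaults can be mirrored in a small `argparse` parser. This is a sketch reconstructed from the list above, not the parser used in the repository: the argument types and the `str2bool` helper are assumptions.

```python
import argparse


def str2bool(value):
    """Parse 'true'/'false'-style strings (assumed boolean convention)."""
    if isinstance(value, bool):
        return value
    if value.lower() in ("true", "1", "yes"):
        return True
    if value.lower() in ("false", "0", "no"):
        return False
    raise argparse.ArgumentTypeError(f"expected a boolean, got {value!r}")


def build_parser():
    """Build a parser whose flags and defaults follow the documented list."""
    parser = argparse.ArgumentParser(description="Sketch of the documented test-script flags")
    parser.add_argument("--inf_config", type=str, default="configs/rival_variation.json")
    parser.add_argument("--img_config", type=str, default="assets/images/configs_variation.json")
    parser.add_argument("--inner_round", type=int, default=1)
    parser.add_argument("--exp_folder", type=str, default="out/variation_exps")
    parser.add_argument("--pretrained_model_path", type=str, default="runwayml/stable-diffusion-v1-5")
    parser.add_argument("--is_half", type=str2bool, default=False)
    parser.add_argument("--is_editing", type=str2bool, default=False)
    parser.add_argument("--editing_early_steps", type=int, default=1000)
    return parser


# Parse an empty argument list to inspect the defaults.
args = build_parser().parse_args([])
print(vars(args))
```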
With a reference image, RIVAL generates images with the same semantic contents and style, without any optimization.
bash scripts/rival_variation_test.sh
Users can modify editing_early_steps in this script to control the editing strength.
bash scripts/rival_editing_test.sh
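As a rough illustration of how `editing_early_steps` acts as a threshold, the sketch below counts how many denoising steps would use normal self-attention under the documented rule "for t > step, do normal inference". The evenly spaced timestep schedule and both function names are assumptions for illustration. The scheduler used in the repository may produce different timesteps.

```python
def ddim_timesteps(ddim_step=50, num_train_timesteps=1000):
    """Evenly spaced timesteps from high noise to low noise (assumed schedule)."""
    stride = num_train_timesteps // ddim_step
    return list(range(0, num_train_timesteps, stride))[::-1]


def uses_normal_self_attention(t, editing_early_steps=1000):
    """Documented rule: for t > editing_early_steps, do normal inference."""
    return t > editing_early_steps


timesteps = ddim_timesteps()
for threshold in (1000, 600):
    n_normal = sum(uses_normal_self_attention(t, threshold) for t in timesteps)
    print(
        f"editing_early_steps={threshold}: "
        f"{n_normal} of {len(timesteps)} steps use normal self-attention"
    )
```

With the default of 1000, no timestep exceeds the threshold, so no step falls back to normal self-attention. Lowering the value increases the number of early steps that do.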
With RIVAL, we can customize both object concepts and style concepts that are hard to describe.
bash scripts/rival_dreambooth_test.sh
Please note that the scope of this application is limited (as shown in the paper, the example can only come from the image itself).
bash scripts/rival_inpainting_test.sh
bash scripts/rival_t2i_test.sh
The config example is given in assets/images/configs_controlnet.json. You may enable more modalities by editing the Python script.
bash scripts/rival_controlnet_test.sh
@article{zhang2023realworld,
title={Real-World Image Variation by Aligning Diffusion Inversion Chain},
author={Yuechen Zhang and Jinbo Xing and Eric Lo and Jiaya Jia},
journal={arXiv preprint arXiv:2305.18729},
year={2023},
}