
[SD3] The quality of the images generated by inference is not as high as on the validation set during fine-tuning? #10475

Open
ytwo-hub opened this issue Jan 6, 2025 · 5 comments
Labels
bug Something isn't working

Comments

@ytwo-hub

ytwo-hub commented Jan 6, 2025

Describe the bug

Why is the quality of the images I generate with StableDiffusion3Pipeline worse than the quality of the validation-set images logged during DreamBooth LoRA fine-tuning?
Do I need some other plugin or parameter setting to match the image quality of the validation set?

Reproduction

# Here is my inference code:

import torch
from diffusers import StableDiffusion3Pipeline

pipe = StableDiffusion3Pipeline.from_pretrained(
    './diffusers/stabilityai/stable-diffusion-3-medium-diffusers',
    torch_dtype=torch.float16,
).to('cuda')
pipe.load_lora_weights("./my_path/pytorch_lora_weights.safetensors", adapter_name="test_lora")

# Keep a reference to the image before saving; PIL's save() returns None.
img = pipe(
    "my prompt...",
    generator=torch.manual_seed(1),
    num_inference_steps=40,
    guidance_scale=6,
).images[0]
img.save('/root/my_img.png')

Logs

No response

System Info

Model: stable-diffusion-3-medium
CUDA Version: 12.4
GPU: NVIDIA A800 80GB

Who can help?

No response

@ytwo-hub ytwo-hub added the bug Something isn't working label Jan 6, 2025
@bhomik749

Is it resolved?

If not, could you provide a bit more detail on which dataset you fine-tuned on and what prompt you are using for inference?

@ytwo-hub
Author

ytwo-hub commented Jan 8, 2025

> Is it resolved?
>
> If not, could you provide a bit more detail on which dataset you fine-tuned on and what prompt you are using for inference?

My dataset is photos of vehicles of different makes. The prompt I use at inference time is the same as the "validation_prompt" used during DreamBooth LoRA fine-tuning, but the quality of the resulting image is much worse than what the "validation" images show in TensorBoard during training.
For example, my training command looks like this:

accelerate launch train_dreambooth_lora_sd3.py \
  --pretrained_model_name_or_path=$MODEL_NAME  \
  --instance_data_dir=$INSTANCE_DIR \
  --output_dir=$OUTPUT_DIR \
  --mixed_precision="fp16" \
  --instance_prompt="a photo of Audi A6L car, in a realistic environment " \
  --resolution=512 \
  --train_batch_size=1 \
  --gradient_accumulation_steps=4 \
  --learning_rate=4e-4 \
  --report_to="wandb" \
  --lr_scheduler="constant" \
  --lr_warmup_steps=0 \
  --max_train_steps=1500 \
  --validation_prompt="a photo of white Audi A6L car on the road" \
  --validation_epochs=30 \
  --seed="0"

@bhomik749

@ytwo-hub
I can see that you haven't provided a unique identifier in your instance prompt, which the DreamBooth technique needs in order to work at all. Try adding a rare identifier token, for example:
instance_prompt = "a photo of sks Audi A6L car, in a realistic environment"

Let me know if it helps.
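For illustration, here is how the earlier training command could carry a unique identifier through both prompts. This is only a sketch: "sks" is one commonly used rare token for DreamBooth, and the elided flags stay exactly as in the original command.

```
accelerate launch train_dreambooth_lora_sd3.py \
  --instance_prompt="a photo of sks Audi A6L car, in a realistic environment" \
  --validation_prompt="a photo of white sks Audi A6L car on the road" \
  ...
```

The same "sks Audi A6L car" phrase then needs to appear in the inference prompt so that the learned concept is actually triggered.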

@ytwo-hub
Author

ytwo-hub commented Jan 9, 2025

> @ytwo-hub I can see that you haven't provided a unique identifier in your instance prompt, which the DreamBooth technique needs in order to work at all. Try adding a rare identifier token, for example: instance_prompt = "a photo of sks Audi A6L car, in a realistic environment"
>
> Let me know if it helps.

Your suggestion works, and it works even better after I add a negative prompt. Thank you very much!
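For reference, `negative_prompt` is a supported argument of the StableDiffusion3Pipeline call, so it can be added to the inference snippet from the issue like this. This is a sketch only: the negative prompt text is just an example, and the local paths are the ones from the original snippet.

```python
import torch
from diffusers import StableDiffusion3Pipeline

# Assumes the same local checkpoint and LoRA weights as in the original snippet.
pipe = StableDiffusion3Pipeline.from_pretrained(
    './diffusers/stabilityai/stable-diffusion-3-medium-diffusers',
    torch_dtype=torch.float16,
).to('cuda')
pipe.load_lora_weights("./my_path/pytorch_lora_weights.safetensors", adapter_name="test_lora")

img = pipe(
    "my prompt...",
    negative_prompt="blurry, low quality, distorted, deformed",  # example negative prompt
    generator=torch.manual_seed(1),
    num_inference_steps=40,
    guidance_scale=6,
).images[0]
img.save('/root/my_img.png')
```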

@bhomik749

Glad to help you out.
