
[SD3] The quality of the images generated by inference is not as high as on the validation set during fine-tuning? #10475

Open
ytwo-hub opened this issue Jan 6, 2025 · 5 comments
Labels
bug Something isn't working

Comments

@ytwo-hub

ytwo-hub commented Jan 6, 2025

Describe the bug

Why is the quality of the images I generate with StableDiffusion3Pipeline worse than the quality of the validation-set images logged during DreamBooth LoRA fine-tuning?
Do I need some other plugin or parameter setting to match the image quality of the validation set?

Reproduction

# Here is my inference code:

import torch
from diffusers import StableDiffusion3Pipeline

pipe = StableDiffusion3Pipeline.from_pretrained(
    './diffusers/stabilityai/stable-diffusion-3-medium-diffusers',
    torch_dtype=torch.float16,
).to('cuda')
pipe.load_lora_weights("./my_path/pytorch_lora_weights.safetensors", adapter_name="test_lora")

# Keep a reference to the image before saving; PIL's save() returns None.
img = pipe(
    "my prompt...",
    generator=torch.manual_seed(1),
    num_inference_steps=40,
    guidance_scale=6,
).images[0]
img.save('/root/my_img.png')

Logs

No response

System Info

Model: stable-diffusion-3-medium
CUDA Version: 12.4
GPU: NVIDIA A800 80GB

Who can help?

No response

@ytwo-hub ytwo-hub added the bug Something isn't working label Jan 6, 2025
@bhomik749

Is it resolved?

If not, could you provide a bit more detail on which dataset you fine-tuned on and what prompt you are using for inference?

@ytwo-hub
Author

ytwo-hub commented Jan 8, 2025

> Is it resolved?
>
> If not, could you provide a bit more detail on which dataset you fine-tuned on and what prompt you are using for inference?

My dataset is photos of vehicles of different makes. The prompt I use at inference time is the same as the "validation_prompt" used during DreamBooth LoRA fine-tuning, but the quality of the resulting image is much worse than what the "validation" images show in TensorBoard during training.
For example, my training command looks like this:

accelerate launch train_dreambooth_lora_sd3.py \
  --pretrained_model_name_or_path=$MODEL_NAME  \
  --instance_data_dir=$INSTANCE_DIR \
  --output_dir=$OUTPUT_DIR \
  --mixed_precision="fp16" \
  --instance_prompt="a photo of Audi A6L car, in a realistic environment " \
  --resolution=512 \
  --train_batch_size=1 \
  --gradient_accumulation_steps=4 \
  --learning_rate=4e-4 \
  --report_to="wandb" \
  --lr_scheduler="constant" \
  --lr_warmup_steps=0 \
  --max_train_steps=1500 \
  --validation_prompt="a photo of white Audi A6L car on the road" \
  --validation_epochs=30 \
  --seed="0"

@bhomik749

@ytwo-hub
I can see that you haven't provided a unique identifier in your instance prompt, which the DreamBooth technique needs in order to work at all. Try adding a rare identifier token, for example:
instance_prompt = "a photo of sks Audi A6L car, in a realistic environment"

Let me know if it helps.
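For illustration, here is how the earlier training command could carry a unique identifier through both prompts. This is only a sketch: "sks" is one commonly used rare token for DreamBooth, and the elided flags stay exactly as in the original command.

```
accelerate launch train_dreambooth_lora_sd3.py \
  --instance_prompt="a photo of sks Audi A6L car, in a realistic environment" \
  --validation_prompt="a photo of white sks Audi A6L car on the road" \
  ...
```

The same "sks Audi A6L car" phrase then needs to appear in the inference prompt so that the learned concept is actually triggered.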

@ytwo-hub
Author

ytwo-hub commented Jan 9, 2025

> @ytwo-hub I can see that you haven't provided a unique identifier in your instance prompt, which the DreamBooth technique needs in order to work at all. Try adding a rare identifier token, for example: instance_prompt = "a photo of sks Audi A6L car, in a realistic environment"
>
> Let me know if it helps.

Your suggestion works, and it works even better after I add a negative prompt. Thank you very much!
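For reference, `negative_prompt` is a supported argument of the StableDiffusion3Pipeline call, so it can be added to the inference snippet from the issue like this. This is a sketch only: the negative prompt text is just an example, and the local paths are the ones from the original snippet.

```python
import torch
from diffusers import StableDiffusion3Pipeline

# Assumes the same local checkpoint and LoRA weights as in the original snippet.
pipe = StableDiffusion3Pipeline.from_pretrained(
    './diffusers/stabilityai/stable-diffusion-3-medium-diffusers',
    torch_dtype=torch.float16,
).to('cuda')
pipe.load_lora_weights("./my_path/pytorch_lora_weights.safetensors", adapter_name="test_lora")

img = pipe(
    "my prompt...",
    negative_prompt="blurry, low quality, distorted, deformed",  # example negative prompt
    generator=torch.manual_seed(1),
    num_inference_steps=40,
    guidance_scale=6,
).images[0]
img.save('/root/my_img.png')
```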

@bhomik749

Glad to help you out.
