Issue with training VTO & Inversion Adapter #39

Open
bertinma opened this issue Sep 18, 2023 · 1 comment

bertinma commented Sep 18, 2023

Hi,

I'm trying to train all the models on 1024x768 images. I managed to train TPS and EMASC at this resolution with some code modifications. Training works well according to the metrics and visual results.

However, it doesn't work at all for the Inversion Adapter and VTO. In both cases the loss does not decrease during training: it is close to constant with heavy smoothing on wandb and oscillates strongly without smoothing.
[Screenshot 2023-09-18 at 18 15 22]
I also tested at 512x384 and got the same results.
Is this expected?

I'm using the default parameters except batch_size = 8 for VTO and batch_size = 1 for the Inversion Adapter, on a single A100 GPU. I assume that a batch size greater than 1 could prevent this training issue, but my hardware doesn't allow a bigger one 😞
I tried reducing the learning rate, but the result is the same.

Commands used to train the Inversion Adapter and VTO:

  • python src/train_inversion_adapter.py --dataset vitonhd --vitonhd_dataroot data/viton-hd/ --output_dir checkpoints/inverter_1024 --gradient_checkpointing --enable_xformers_memory_efficient_attention --use_clip_cloth_features --allow_tf32 --pretrained_model_name_or_path pretrained_models/stable-diffusion-2-inpainting/ --height 1024 --width 768 --train_batch_size 1 --test_batch_size 1

  • python src/train_vto.py --dataset vitonhd --vitonhd_dataroot data/viton-hd/ --output_dir checkpoints/vto_1024 --inversion_adapter_dir checkpoints/inverter_1024/ --gradient_checkpointing --enable_xformers_memory_efficient_attention --use_clip_cloth_features --height 1024 --width 768 --train_batch_size 8 --test_batch_size 8 --allow_tf32

Could you please help me resolve this problem?
Thanks for your clean work btw :)

@ABaldrati
Member

Hi @bertinma,

Thank you for your interest in our work!

Regarding experiments at a resolution of 1024x768, I must admit that we did not test the training procedure at that specific resolution, so I may not be able to give you precise guidance there.

At a resolution of 512x384, although the loss behavior appeared to be similar, we did notice improvements in the metrics as training progressed. Did you notice the same improvements in the metrics during training?

Regarding the batch size, did you try setting the --gradient_accumulation_step parameter so that batch_size multiplied by the number of accumulation steps equals the desired effective batch size?
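
As a rough illustration of the gradient accumulation idea mentioned above (this is not code from the repository, and the helper names `accumulation_steps` and `grad_mse` are made up for this sketch): averaging the gradients of several equally sized micro-batches gives the same gradient as one large batch, so a per-step batch size of 1 can emulate a larger effective batch. The toy model, data, and target batch size of 8 are assumptions chosen only to make the arithmetic visible.

```python
import math


def accumulation_steps(desired_batch_size: int, per_step_batch_size: int) -> int:
    """Hypothetical helper: micro-batches needed to reach the desired effective batch size."""
    if per_step_batch_size <= 0 or desired_batch_size < per_step_batch_size:
        raise ValueError("desired batch size must be >= per-step batch size > 0")
    return math.ceil(desired_batch_size / per_step_batch_size)


def grad_mse(w: float, xs: list, ys: list) -> float:
    """Gradient of the mean squared error of y ~ w * x with respect to w."""
    return sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)


# Toy data (assumed for illustration only).
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
ys = [2.1, 3.9, 6.2, 8.1, 9.8, 12.2, 13.9, 16.1]
w = 0.5

micro_batch = 1                           # what fits in memory
steps = accumulation_steps(8, micro_batch)

# Reference: gradient computed on the full batch of 8 samples.
full_grad = grad_mse(w, xs, ys)

# Accumulate scaled micro-batch gradients; the optimizer would step once after the loop.
accumulated = 0.0
for i in range(steps):
    chunk_x = xs[i * micro_batch:(i + 1) * micro_batch]
    chunk_y = ys[i * micro_batch:(i + 1) * micro_batch]
    accumulated += grad_mse(w, chunk_x, chunk_y) / steps

print(steps, math.isclose(accumulated, full_grad))
```

The equality is exact only when the micro-batches have equal size and the model has no batch-dependent layers; in a real training script the accumulation is normally handled by the training framework rather than written by hand.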

Another point: if you want to achieve optimal performance, you need to pass the --train_inversion_adapter flag during VTO training.

If you have any more questions or need further insights, please feel free to ask.

Best regards,
Alberto
