How to quantize a custom Flux model? For example, lightweight models like Flux Lite that have removed some double blocks. #13

Open
kelisiya opened this issue Nov 11, 2024 · 2 comments

Comments

@kelisiya

Will there be more general scripts provided in the future to support various custom Flux models?

@lmxyy
Collaborator

lmxyy commented Nov 12, 2024

Yeah, the quantization library is at mit-han-lab/deepcompressor. We are also cleaning our LoRA conversion scripts and will release the instructions soon on how to support customized LoRA.

@kelisiya
Author

kelisiya commented Nov 12, 2024

> Yeah, the quantization library is at mit-han-lab. We are also cleaning our LoRA conversion scripts and will release the instructions soon on how to support customized LoRA.

When I run example.py, I get the following error:
[2024-11-12 09:20:39.140] [info] Initializing QuantizedFluxModel
[2024-11-12 09:20:39.384] [info] Loading weights from /data3/home/research/FLUX_train/nunchaku/model/svdq-int4-flux.1-dev.safetensors
[2024-11-12 09:20:40.235] [info] Done.
0%| | 0/28 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "/data3/home/research/FLUX_train/nunchaku/example.py", line 10, in <module>
    image = pipeline("A cat holding a sign that says hello world", num_inference_steps=28, guidance_scale=0).images[0]
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/diffusers/pipelines/flux/pipeline_flux.py", line 730, in __call__
    noise_pred = self.transformer(
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/diffusers/models/transformers/transformer_flux.py", line 500, in forward
    encoder_hidden_states, hidden_states = block(
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "/data3/home/research/FLUX_train/nunchaku/nunchaku/models/flux.py", line 51, in forward
    hidden_states = self.m.forward(
RuntimeError: CUDA error: no kernel image is available for execution on the device (at /data3/home/research/FLUX_train/nunchaku/src/kernels/awq/gemv_awq.cu:311)

My CUDA version:
Copyright (c) 2005-2023 NVIDIA Corporation
Built on Wed_Nov_22_10:17:15_PST_2023
Cuda compilation tools, release 12.3, V12.3.107
Build cuda_12.3.r12.3/compiler.33567101_0
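
For anyone hitting the same error: "no kernel image is available for execution on the device" generally means the compiled CUDA extension contains no binary for the GPU's compute capability, so the toolkit version alone does not settle it. Below is a minimal diagnostic sketch, not an official nunchaku tool. The helper names `sm_tag` and `has_kernel_for` are hypothetical, the `sm_80`/`sm_86`/`sm_89` list is only an illustrative assumption about what an extension might have been built for, and the check ignores PTX forward compatibility and architecture-specific targets such as `sm_90a`.

```python
import re


def sm_tag(capability):
    """Format a (major, minor) compute capability as an 'sm_XY' tag."""
    major, minor = capability
    return f"sm_{major}{minor}"


def has_kernel_for(capability, compiled_archs):
    """Return True if any 'sm_XY' entry can run on the given device.

    Simplified rule: a binary built for sm_XY runs on devices with the
    same major version X and a minor version >= Y. PTX JIT and
    arch-specific suffixes (e.g. 'sm_90a') are deliberately ignored.
    """
    major, minor = capability
    for arch in compiled_archs:
        match = re.fullmatch(r"sm_(\d+)(\d)", arch)
        if match is None:
            continue  # skip entries such as 'compute_80' or 'sm_90a'
        arch_major, arch_minor = int(match.group(1)), int(match.group(2))
        if arch_major == major and arch_minor <= minor:
            return True
    return False


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        torch = None

    if torch is not None and torch.cuda.is_available():
        cap = torch.cuda.get_device_capability(0)
        print("GPU:", torch.cuda.get_device_name(0), sm_tag(cap))
        # Architectures PyTorch itself was built for; the extension's own
        # build targets may differ and must be checked in its build config.
        print("PyTorch arch list:", torch.cuda.get_arch_list())
        # Hypothetical list of targets, used here only for illustration.
        assumed_archs = ["sm_80", "sm_86", "sm_89"]
        print("Covered by assumed targets:", has_kernel_for(cap, assumed_archs))
    else:
        print("No CUDA device visible to PyTorch.")
```

If the reported capability is not covered by the architectures the extension was actually built for, rebuilding it for that architecture or running on a supported GPU is the usual way out.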
