
Can anyone explain Table 4 and Table 6 please? #51

Open
himmetozcan opened this issue Sep 9, 2024 · 0 comments
Hello. Thanks for the great work. I am trying to understand some of the results in the paper, specifically Tables 4 and 6. Could you clarify them for me? Thank you!

Table 4:
[screenshot: Table 4 from the paper]

In Table 4 they have the same throughput, meaning they can process the same number of images per second. So what I understand is that, for example for DDIM-59 steps, DeepCache uses fewer full-inference steps plus some shallow-inference steps, but still takes the same inference time in total. For example, if N_skip = 2, DeepCache might perform only 40 full steps and use shallow-network inference (partial inference) for the other 40 intermediate steps. And with this setup the image quality is the same according to the table.
If that is so, what is the point of using DeepCache with this setup? I would have expected a comparison like: with full-inference DDIM at 50 steps as the baseline, we use DeepCache with N_skip = 2 and get ~2x faster inference, while image quality drops by some amount in FID.
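
To make my reading of the step schedule concrete, here is a toy sketch of what I think N_skip = 2 does (my own assumption, not code from the paper or this repo):

```python
# Toy illustration of my understanding: with N_skip = 2, a full U-Net pass
# runs on every N_skip-th step, and the cached deep features are reused with
# a shallow pass on the steps in between. All names here are hypothetical.
N_total = 50   # total denoising steps
N_skip = 2     # cache interval

schedule = ["full" if t % N_skip == 0 else "shallow" for t in range(N_total)]
n_full = schedule.count("full")
n_shallow = schedule.count("shallow")
print(f"{n_full} full passes + {n_shallow} shallow passes out of {N_total} steps")
# -> 25 full passes + 25 shallow passes out of 50 steps
```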

Table 6:
Here is the part of the paper that explains Table 6: "Results presented in Table 6 indicate that, with the additional computation of the shallow U-Net, DeepCache improves the 50-step DDIM by 0.32 and the 10-step DDIM by 2.98."

[screenshot: Table 6 from the paper]

I am trying to understand this. Specifically, you mention:

"Steps here mean the number of steps that perform full model inference."

Does this mean that both DDIM and DeepCache perform the same number of full U-Net inference steps, but DeepCache adds shallow-network inference on top? If so, is that why you don't compare inference times in this table, since in that case DeepCache would actually be slower?
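
If that reading is correct, my rough cost intuition looks like the sketch below; the shallow-to-full cost ratio here is purely a number I made up for illustration, not one from the paper:

```python
# Rough cost model for my question above. `shallow_cost_ratio` is a made-up
# placeholder for how expensive a shallow (cached) pass is relative to a full
# U-Net pass; the real ratio depends on the model and which branch is cached.
full_steps = 50            # full U-Net passes (the "steps" in Table 6, as I read it)
shallow_steps = 50         # extra shallow passes DeepCache adds in between
shallow_cost_ratio = 0.2   # ASSUMPTION: one shallow pass ~ 20% of a full pass

ddim_cost = full_steps * 1.0
deepcache_cost = full_steps * 1.0 + shallow_steps * shallow_cost_ratio
print(f"DDIM:      {ddim_cost:.0f} full-pass equivalents")
print(f"DeepCache: {deepcache_cost:.0f} full-pass equivalents (slower at equal full steps)")
```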

Maybe my questions are silly, sorry for that. I am just trying to understand the trade-off between image quality and inference speed for Stable Diffusion models. For example, DeepCache is implemented for the diffusers library, and the example script simply uses it with N_skip = 2 and DDIM as the default scheduler, and we see an improved speed. But it is not clear to me how much the image quality drops when we use your method with Stable Diffusion models.
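
For reference, this is roughly how I am enabling it, following the usage shown in this repo's README and the diffusers docs; the parameter values are just the defaults I have seen, so treat them as my assumptions:

```python
import torch
from diffusers import StableDiffusionPipeline
from DeepCache import DeepCacheSDHelper

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

# Enable DeepCache: cache_interval corresponds to N_skip (a full U-Net pass
# every 2 steps, a cached shallow pass otherwise); cache_branch_id selects
# the skip branch whose features are cached and reused.
helper = DeepCacheSDHelper(pipe=pipe)
helper.set_params(cache_interval=2, cache_branch_id=0)
helper.enable()

image = pipe("a photo of an astronaut riding a horse").images[0]
helper.disable()
```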
