Unable to Retrieve Intermediate Gradients with CogVideoXPipeline #9698

Open
lovelyczli opened this issue Oct 17, 2024 · 3 comments
Labels
bug Something isn't working

Comments

@lovelyczli

Describe the bug

When generating videos with CogVideoXPipeline, we need to access the gradients of intermediate tensors. However, we do not need any additional training or parameter updates for the model.

We tried using register_forward_hook to capture the gradients, but this approach failed because the CogVideoXPipeline disables gradient calculations. Specifically, in pipelines/cogvideo/pipeline_cogvideox.py at line 478, gradient tracking is turned off with @torch.no_grad().
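
For reference, a minimal sketch of the hook-based attempt (the hooked module, prompt, and step count are illustrative): under @torch.no_grad() the captured output carries no grad_fn, so there is no graph to backpropagate through later.

import torch
from diffusers import CogVideoXPipeline

pipe = CogVideoXPipeline.from_pretrained(
    "THUDM/CogVideoX-2b", torch_dtype=torch.float16
).to("cuda")

captured = {}

def save_output(module, inputs, output):
    # Runs during the forward pass, but under @torch.no_grad() the captured
    # tensors have no grad_fn, so their gradients can never be retrieved.
    captured["transformer_output"] = output

handle = pipe.transformer.register_forward_hook(save_output)
video = pipe(prompt="a panda playing a guitar", num_inference_steps=2).frames[0]
handle.remove()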

How can we resolve this issue and retrieve the gradients without modifying the model’s parameters or performing extra training?

Reproduction

Sample Code
import torch
from diffusers import CogVideoXPipeline

# Load the 2B checkpoint in fp16 and move it to the GPU.
pipe = CogVideoXPipeline.from_pretrained(
    "THUDM/CogVideoX-2b",
    torch_dtype=torch.float16,
).to("cuda")

prompt = "..."  # any text prompt

# The whole call runs under @torch.no_grad() (see below), so no graph is built.
video = pipe(
    prompt=prompt,
    num_videos_per_prompt=1,
    num_inference_steps=50,
    num_frames=49,
    guidance_scale=6,
    generator=torch.Generator(device="cuda").manual_seed(42),
).frames[0]

Pipeline Code Reference
pipelines/cogvideo/pipeline_cogvideox.py, line 478:

@torch.no_grad()
@replace_example_docstring(EXAMPLE_DOC_STRING)
def __call__(
    self,
    prompt: Optional[Union[str, List[str]]] = None,
    negative_prompt: Optional[Union[str, List[str]]] = None,
    height: int = 480,
    width: int = 720,
    ...

Logs

No response

System Info

Diffusers version: 0.30.3

Who can help?

No response

lovelyczli added the bug label on Oct 17, 2024
@a-r-r-o-w (Member)

The pipelines should not be used for training. They are only meant for inference, so gradient tracking cannot be done unless you modify the code to suit your needs. Instead, you will have to use each modeling component and write the training loop yourself. You can see an example of training here.
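
A rough sketch of what loading the components individually could look like (the subfolder names and scheduler class are assumptions based on the standard diffusers layout of the THUDM/CogVideoX-2b checkpoint):

import torch
from diffusers import AutoencoderKLCogVideoX, CogVideoXDDIMScheduler, CogVideoXTransformer3DModel
from transformers import AutoTokenizer, T5EncoderModel

model_id = "THUDM/CogVideoX-2b"

# Load each modeling component separately instead of going through the pipeline.
tokenizer = AutoTokenizer.from_pretrained(model_id, subfolder="tokenizer")
text_encoder = T5EncoderModel.from_pretrained(model_id, subfolder="text_encoder", torch_dtype=torch.float16)
transformer = CogVideoXTransformer3DModel.from_pretrained(model_id, subfolder="transformer", torch_dtype=torch.float16)
vae = AutoencoderKLCogVideoX.from_pretrained(model_id, subfolder="vae", torch_dtype=torch.float16)
scheduler = CogVideoXDDIMScheduler.from_pretrained(model_id, subfolder="scheduler")

# The denoising loop (prompt encoding, latent preparation, scheduler steps,
# decoding) then has to be written by hand, as in the linked training example.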

@lovelyczli (Author) commented on Oct 17, 2024

@a-r-r-o-w Thank you for your prompt reply and the training code. I noticed that the provided training code requires independent modules, including T5EncoderModel, CogVideoXTransformer3DModel, and AutoencoderKLCogVideoX.

This approach seems somewhat cumbersome, as our requirement does not involve training or updating model parameters; we only need to access the gradients.

Would simply removing the torch.no_grad() decorator from lines 478-485 in the local pipeline_cogvideox.py resolve the issue efficiently?

Thank you very much!

@a-r-r-o-w (Member)

Yes, removing the @torch.no_grad() decorator would make it possible to access gradients. The models are in .eval() mode by default, so layers like dropout will not take effect.
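
For reference, a rough sketch of how the gradients could then be captured with a backward hook, assuming the decorator has been removed from the local pipeline_cogvideox.py (module choice, prompt, and step count are illustrative; the autograd graph spans every denoising step, so memory use grows quickly):

import torch
from diffusers import CogVideoXPipeline

pipe = CogVideoXPipeline.from_pretrained(
    "THUDM/CogVideoX-2b", torch_dtype=torch.float16
).to("cuda")

grads = {}

def save_grad(module, grad_input, grad_output):
    # Called during backward(); grad_output holds the gradient of the loss
    # with respect to the transformer's output.
    grads["transformer"] = grad_output

pipe.transformer.register_full_backward_hook(save_grad)

# Small step count and latent output keep the example cheap; the graph still
# spans every denoising step, so memory grows with num_inference_steps.
latents = pipe(
    prompt="a panda playing a guitar",
    num_inference_steps=2,
    output_type="latent",
).frames[0]

latents.float().sum().backward()
print(grads["transformer"][0].shape)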
