
Use randn_tensor to replace torch.randn #10535

Open · wants to merge 1 commit into main
Conversation

@lmxyy (Contributor) commented Jan 11, 2025

What does this PR do?

The LTX-Video pipeline uses torch.randn directly to create the initial random latents. torch.randn requires the generator and the latents to be on the same device, while the wrapped helper randn_tensor does not have this limitation.

Fixes # (issue)
Fixes the runtime error raised when the generator and the latents are on different devices. For example, seeding with a CPU generator while generating a video on CUDA makes the original pipeline raise an error.
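The device-safe pattern that randn_tensor follows can be sketched in plain PyTorch: draw the noise on the generator's own device first, then move the result to the target device. This is a simplified sketch, not the actual diffusers implementation; the real helper in diffusers.utils.torch_utils also handles dtype layouts and lists of per-sample generators.

```python
import torch

def randn_device_safe(shape, generator=None, device=None, dtype=None):
    # Sketch of the idea behind diffusers' randn_tensor: sample on the
    # generator's own device so torch.randn never sees a device mismatch,
    # then move the finished tensor to the requested device.
    gen_device = generator.device if generator is not None else torch.device("cpu")
    latents = torch.randn(shape, generator=generator, device=gen_device, dtype=dtype)
    return latents.to(device=device)

# A CPU generator can now seed latents destined for any device.
gen = torch.Generator().manual_seed(0)
latents = randn_device_safe((1, 4, 8, 8), generator=gen, device="cpu", dtype=torch.float32)
print(latents.shape)  # torch.Size([1, 4, 8, 8])
```

Because sampling always happens on the generator's device, the same seed reproduces the same latents regardless of where the tensor finally lives.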

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.
@yiyixuxu @asomoza

`torch.randn` requires `generator` and `latents` on the same device, while the wrapped function `randn_tensor` does not have this issue.
@lmxyy (Contributor, Author) commented Jan 11, 2025

A simple script to reproduce the error:

import torch
from diffusers import LTXPipeline
from diffusers.utils import export_to_video

pipe = LTXPipeline.from_pretrained("a-r-r-o-w/LTX-Video-0.9.1-diffusers", torch_dtype=torch.bfloat16)
pipe.to("cuda")

prompt = (
    "A woman with long brown hair and light skin smiles at another woman with long blonde hair. "
    "The woman with brown hair wears a black jacket and has a small, barely noticeable mole on her right cheek. "
    "The camera angle is a close-up, focused on the woman with brown hair's face. The lighting is warm and natural, "
    "likely from the setting sun, casting a soft glow on the scene. The scene appears to be real-life footage"
)
negative_prompt = "worst quality, inconsistent motion, blurry, jittery, distorted"

video = pipe(
    prompt=prompt,
    negative_prompt=negative_prompt,
    width=768,
    height=512,
    num_frames=161,
    decode_timestep=0.03,
    decode_noise_scale=0.025,
    num_inference_steps=50,
    # The generator lives on the CPU while the pipeline runs on CUDA,
    # which triggers the device-mismatch error inside torch.randn.
    generator=torch.Generator().manual_seed(0),
).frames[0]

export_to_video(video, "output.mp4", fps=24)
