
image_root = 'data/image/george_full' #28

Open
zzddwyff opened this issue Oct 31, 2024 · 8 comments

@zzddwyff

Where is george_full?

@AndysonYs
Collaborator

it is the George sub-dataset here: https://huggingface.co/datasets/TencentARC/StoryStream.

@zzddwyff
Author

Do I need to download the entire george.zip.gz just to run inference?

@zzddwyff
Author

I just want to run inference; do I need to download all of george.zip.gz?

@AndysonYs
Collaborator

No, you don't. You can take any single image-text pair as input.

@zzddwyff
Author

OK, let me try.

@zzddwyff
Author

Can you tell me what I should change in your code to run inference on a single image? It looks like your code uses many images.

@zzddwyff
Author

Traceback (most recent call last):
  File "/root/autodl-tmp/SEED-Story/src/inference/gen_george.py", line 213, in <module>
    images_gen = adapter.generate(image_embeds=output['img_gen_feat'], num_inference_steps=50)
  File "/root/autodl-tmp/SEED-Story/src/models_ipa/adapter_modules.py", line 455, in generate
    images = self.sdxl_pipe(
  File "/root/autodl-tmp/conda/envs/seed/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/root/autodl-tmp/conda/envs/seed/lib/python3.10/site-packages/diffusers/pipelines/stable_diffusion_xl/pipeline_stable_diffusion_xl.py", line 730, in __call__
    ) = self.encode_prompt(
  File "/root/autodl-tmp/conda/envs/seed/lib/python3.10/site-packages/diffusers/pipelines/stable_diffusion_xl/pipeline_stable_diffusion_xl.py", line 379, in encode_prompt
    prompt_embeds = prompt_embeds.to(dtype=self.text_encoder_2.dtype, device=device)
AttributeError: 'NoneType' object has no attribute 'dtype'

Only one image came out.

@AndysonYs
Collaborator

> Can you tell me what I should change in your code to run inference on a single image? It looks like your code uses many images.

Hi. Please see src/inference/gen_george.py. If you want to generate a long story from only one text-image pair, change line 152 to
for j in range(1):
and make sure the first line of the val.jsonl file is your input.

If you don't want a long story and just need to generate one text-image pair, also change the story_len param on line 205 to 2.
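The suggested edit can be sketched as follows. This is an illustrative standalone sketch, not the actual code from gen_george.py: the function names, the JSONL field names, and the placeholder model call are all hypothetical; only the `for j in range(1):` change and the "first line of val.jsonl is your input" rule come from the answer above.

```python
import json

def load_first_pair(jsonl_path):
    """Return only the first image-text pair from a JSONL file,
    mirroring 'make sure the first line of val.jsonl is your input'."""
    with open(jsonl_path) as f:
        return json.loads(f.readline())

def run_inference(pairs, story_len=2):
    """Placeholder loop mirroring the suggested change: iterate over a
    single seed pair instead of the whole dataset."""
    outputs = []
    for j in range(1):  # was: a loop over many seed pairs (line 152)
        seed = pairs[j]
        # ...the actual model call would go here; with story_len=2 the
        # model generates one new image-text pair after the seed pair.
        outputs.append({"seed": seed, "generated": story_len - 1})
    return outputs
```

With `story_len=2`, the loop runs once and produces a single generated pair per seed, which matches "change the story_len param to 2" above.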
