Support image prompt #29

wentao-uw · 2024-08-22T05:52:25Z

Use case: replace a logo or text in a video. Input: the old logo, a video containing the old logo, and the new logo. Output: the video with the new logo.

rentainhe · 2024-08-23T02:27:02Z

Hi @wentao-uw, supporting referring detection or segmentation based on an image prompt is a good idea, but Grounding DINO currently supports only text prompts. For referring detection or detection based on a visual prompt, you can try combining SAM 2 with our T-Rex2 model.

You can then pair this pipeline with a video-editing model to perform the additional editing on the video.
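As a rough illustration of that last editing step only, here is a minimal sketch that assumes a detector and tracker have already produced one bounding box per frame for the old logo. It is not part of Grounded SAM 2, T-Rex2, or any video-editing model: the helper names `resize_nearest` and `replace_logo` and the `(x0, y0, x1, y1)` box format are hypothetical, and a plain rectangular paste stands in for whatever a real editing model would do.

```python
import numpy as np


def resize_nearest(image, out_h, out_w):
    """Nearest-neighbour resize of an H x W x C array (NumPy only)."""
    in_h, in_w = image.shape[:2]
    # Map every output row/column back to a source row/column.
    rows = np.arange(out_h) * in_h // out_h
    cols = np.arange(out_w) * in_w // out_w
    return image[rows[:, None], cols[None, :]]


def replace_logo(frames, boxes, new_logo):
    """Paste new_logo over the tracked box in each frame.

    frames:   list of H x W x 3 uint8 arrays
    boxes:    one (x0, y0, x1, y1) box per frame (hypothetical format),
              or None for frames where the old logo is not visible
    new_logo: h x w x 3 uint8 array
    """
    edited = []
    for frame, box in zip(frames, boxes):
        out = frame.copy()  # leave the input frames untouched
        if box is not None:
            x0, y0, x1, y1 = box
            out[y0:y1, x0:x1] = resize_nearest(new_logo, y1 - y0, x1 - x0)
        edited.append(out)
    return edited
```

In practice the per-frame masks from SAM 2 would allow a tighter composite than a rectangular box, and a dedicated video-editing model would handle perspective, lighting, and occlusion, none of which this sketch attempts.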

rentainhe · 2024-08-26T05:47:50Z

Hi @wentao-uw, for image-prompt detection and segmentation you can also try DINOv. It can detect or track anything from a visual prompt.

rentainhe added the discussion label Sep 4, 2024
