Add image-to-image task w/ Swin2SR (for super-resolution) #381
cc @josephrocca :) I also intend to replicate/showcase the results from their README.
Example using https://huggingface.co/Xenova/swin2SR-compressed-sr-x4-48:
import { pipeline } from '@xenova/transformers';
let url = 'https://huggingface.co/spaces/jjourney1125/swin2sr/resolve/main/testsets/real-inputs/shanghai.jpg';
let upscaler = await pipeline('image-to-image', 'Xenova/swin2SR-compressed-sr-x4-48');
let output = await upscaler(url);
Awesome!! Seems to take quite a while to load the model - about 40 seconds, not including the download. I'm guessing it's a similar problem to this: microsoft/onnxruntime#11217 since Netron also complains that there are lots of nodes, and takes a very long time to load. The actual inference is about 40 seconds on 8 threads - not bad! WebGPU will get this to a very usable inference time. Exciting!
This PR adds support for image-to-image translation, starting with the Swin2SR family of models for super-resolution. See here for the list of already-converted models, including 2x and 4x upscalers.
Closes #138
Example usage
Pipeline API
Example code adapted from here.
AutoClasses
Example code adapted from here.
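When calling the model directly instead of going through the pipeline, the raw output generally has to be turned back into displayable pixels. The sketch below illustrates that step only. It assumes the output is planar (channels-first) float data nominally in [0, 1], which this PR does not state explicitly, and `chwFloatToHwcUint8` is a hypothetical helper, not part of the library.

```javascript
// Hypothetical helper (not a library function): converts planar CHW float data,
// assumed to be nominally in [0, 1], into interleaved HWC 8-bit pixels.
function chwFloatToHwcUint8(data, channels, height, width) {
  const out = new Uint8ClampedArray(height * width * channels);
  for (let c = 0; c < channels; ++c) {
    for (let y = 0; y < height; ++y) {
      for (let x = 0; x < width; ++x) {
        const v = data[c * height * width + y * width + x];
        // Clamp out-of-range values before scaling to 0..255.
        const clamped = Math.min(1, Math.max(0, v));
        out[(y * width + x) * channels + c] = Math.round(clamped * 255);
      }
    }
  }
  return out;
}

// Stand-in data for a 1x2 RGB image: R plane [0, 1], G plane [0.5, 1.2], B plane [-0.1, 0.25].
const chw = Float32Array.from([0, 1, 0.5, 1.2, -0.1, 0.25]);
const hwc = chwFloatToHwcUint8(chw, 3, 1, 2);
console.log(Array.from(hwc)); // [0, 128, 0, 255, 255, 64]
```

Clamping before scaling keeps slight overshoots from the model (values a little below 0 or above 1) from wrapping around when stored as 8-bit pixels.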
Example output
input (256x256):
output w/ unquantized model (512x512):
note: produces the exact same output as the python implementation (within floating-point precision errors of course).
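One way to check a claim like this is to compare the two outputs element by element and look at the largest deviation. The snippet below is a minimal sketch of that idea; `maxAbsDiff` is a hypothetical helper, the pixel values are stand-in data rather than real model output, and the tolerance of 1 (one 8-bit level) is an assumed threshold.

```javascript
// Hypothetical helper: largest per-element absolute difference between two
// equally sized pixel buffers.
function maxAbsDiff(a, b) {
  if (a.length !== b.length) {
    throw new Error(`Length mismatch: ${a.length} vs ${b.length}`);
  }
  let max = 0;
  for (let i = 0; i < a.length; ++i) {
    const d = Math.abs(a[i] - b[i]);
    if (d > max) max = d;
  }
  return max;
}

// Stand-in data: in practice these would be the JS output pixels and the
// pixels exported from the Python implementation.
const jsPixels = Uint8ClampedArray.from([12, 200, 255, 0]);
const pyPixels = Uint8ClampedArray.from([12, 201, 255, 0]);

const diff = maxAbsDiff(jsPixels, pyPixels);
console.log(diff <= 1 ? 'match within tolerance' : 'mismatch');
```

The same helper can be pointed at the quantized output to get a rough sense of how far quantization moves the result.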
output w/ quantized model (512x512):
side-by-side (input vs. unquantized output):