This is a Truss for DeepFloyd-IF. DeepFloyd-IF is a pixel-based text-to-image triple-cascaded diffusion model that can generate pictures and sets a new state-of-the-art for photorealism and language understanding. The result is a highly efficient model that outperforms current state-of-the-art models, achieving a zero-shot FID-30K score of 6.66 on the COCO dataset.
Model details:
- Developed by: DeepFloyd, StabilityAI
- Model type: pixel-based text-to-image cascaded diffusion model
- Cascade Stage: I
- Num Parameters: 4.3B
- Language(s): primarily English and, to a lesser extent, Romance languages
- License: DeepFloyd IF License Agreement
- Model Description: DeepFloyd-IF is modular, composed of a frozen text encoder and three cascaded pixel diffusion modules, each designed to generate images of increasing resolution: 64x64, 256x256, and 1024x1024. All stages of the model use a frozen text encoder based on the T5 transformer to extract text embeddings, which are then fed into a UNet architecture enhanced with cross-attention and attention pooling.
Before deploying this model, you'll need to:
- Accept the terms of service of the DeepFloyd XL model here.
- Retrieve your Hugging Face token from the settings.
- Set your Hugging Face token as a Baseten secret here with the key hf_access_token.
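As a rough illustration of how the secret is consumed, a Truss model class usually receives configured secrets through its constructor's keyword arguments. The sketch below assumes that layout; the class body is illustrative only and is not the actual model code shipped in this Truss.

```python
class Model:
    """Illustrative skeleton; the real model.py in this Truss may differ."""

    def __init__(self, **kwargs):
        # Truss passes configured secrets to the model as a keyword argument.
        self._secrets = kwargs.get("secrets") or {}
        self._hf_token = None

    def load(self):
        # Look up the token under the key configured in Baseten.
        token = self._secrets.get("hf_access_token")
        if not token:
            raise ValueError("hf_access_token secret is not set")
        # A real implementation would pass this token to the weight download step.
        self._hf_token = token
```

If the secret is missing, failing early in load() gives a clearer error than a failed weight download later on.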
First, clone this repository:
git clone https://github.com/basetenlabs/truss-examples/
cd deepfloyd-xl-truss
Before deployment:
- Make sure you have a Baseten account and API key.
- Install the latest version of Truss:
pip install --upgrade truss
With deepfloyd-xl-truss as your working directory, you can deploy the model with:
truss push --trusted
Paste your Baseten API key if prompted.
For more information, see Truss documentation.
This deployment of DeepFloyd takes a dictionary as input, which requires the following key:
- prompt: the prompt for image generation
It also supports a number of other parameters detailed in this blog post.
The result will be a dictionary containing:
- status: either success or failed
- data: list of base64-encoded images
- message: will contain details in the case of errors
{
"status": "success",
"data": ["/9j/4AAQSkZJRgABAQAAAQABAA...."],
"message": null
}
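To turn a response like this into image files, the base64 strings in data can be decoded and written to disk. This is a minimal sketch using only the standard library; the function name save_images and the .jpg extension are assumptions (the sample payload starts with /9j/, which is how base64-encoded JPEG data begins).

```python
import base64
from pathlib import Path


def save_images(response, out_dir="."):
    """Decode the base64 strings in response["data"] and write them to disk."""
    if response.get("status") != "success":
        # On failure, "message" is expected to carry the error details.
        raise RuntimeError(response.get("message") or "image generation failed")
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    for i, encoded in enumerate(response["data"]):
        # File naming and extension are assumptions made for this example.
        path = out / f"image_{i}.jpg"
        path.write_bytes(base64.b64decode(encoded))
        paths.append(path)
    return paths
```

Calling save_images(response, "outputs") returns the list of written paths, one per generated image.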
You can invoke the model with the Truss CLI:
truss predict -d '{"prompt": "man on moon"}'
You can also invoke it via cURL:
curl -X POST https://app.baseten.co/models/EqwKvqa/predict \
-H 'Authorization: Api-Key {YOUR_API_KEY}' \
-d '{"prompt": "man on moon"}'
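The same request can be sketched in Python with only the standard library. The helper names build_request and predict are hypothetical, the endpoint is copied from the cURL example and should be replaced with your own model's URL, and the explicit Content-Type header is an addition not present in the cURL call. The exact shape of the returned JSON may differ from the dictionary described above depending on the API version.

```python
import json
import urllib.request

# Endpoint copied from the cURL example; substitute your own model's URL.
MODEL_URL = "https://app.baseten.co/models/EqwKvqa/predict"


def build_request(prompt, api_key, url=MODEL_URL):
    """Build a POST request that mirrors the cURL call."""
    body = json.dumps({"prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={
            "Authorization": f"Api-Key {api_key}",
            # Assumption: declaring the body as JSON; the cURL example omits this.
            "Content-Type": "application/json",
        },
        method="POST",
    )


def predict(prompt, api_key, url=MODEL_URL, timeout=300):
    """Send the request and return the parsed JSON response."""
    request = build_request(prompt, api_key, url)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.load(resp)
```

Splitting request construction from sending keeps the network call in one place, so the payload and headers can be inspected without contacting the service.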