DeepFloyd XL Truss

This is a Truss for DeepFloyd-IF, a pixel-based, triple-cascaded text-to-image diffusion model. It sets a new state of the art for photorealism and language understanding. The model is efficient and outperforms prior state-of-the-art models, achieving a zero-shot FID-30K score of 6.66 on the COCO dataset.

Model details:

  • Developed by: DeepFloyd, StabilityAI
  • Model type: pixel-based text-to-image cascaded diffusion model
  • Cascade Stage: I
  • Num Parameters: 4.3B
  • Language(s): primarily English and, to a lesser extent, other Romance languages
  • License: DeepFloyd IF License Agreement
  • Model Description: DeepFloyd-IF is a modular model composed of a frozen text encoder and three cascaded pixel diffusion modules, each designed to generate images of increasing resolution: 64x64, 256x256, and 1024x1024. All stages use a frozen text encoder based on the T5 transformer to extract text embeddings, which are then fed into a UNet architecture enhanced with cross-attention and attention pooling.

Before deploying this model, you'll need to:

  1. Accept the terms of service of the DeepFloyd XL model here.
  2. Retrieve your Hugging Face token from your Hugging Face account settings.
  3. Set your Hugging Face token as a Baseten secret here with the key hf_access_token.
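
For context on why the secret key name matters, here is a minimal sketch of how a Truss model class can read a secret at load time. It is not the model code from this repository: the class body, the `_hf_token` attribute, and the error message are illustrative assumptions. Only the `secrets` constructor argument and the `hf_access_token` key come from Truss conventions and the steps above.

```python
class Model:
    """Illustrative Truss-style model that reads a Hugging Face token."""

    def __init__(self, **kwargs):
        # Truss passes configured secrets to the constructor under "secrets".
        self._secrets = kwargs.get("secrets") or {}
        self._hf_token = None  # hypothetical attribute, for illustration only

    def load(self):
        # The key must match the name used when creating the Baseten secret.
        token = self._secrets["hf_access_token"]
        if not token:
            raise ValueError("hf_access_token secret is empty")
        self._hf_token = token


# Local stand-in for what the serving environment would supply.
model = Model(secrets={"hf_access_token": "hf_example_token"})
model.load()
```

If the secret is stored under a different key, a lookup like this one would fail, so the key name in step 3 should be used exactly as written.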

Deploying DeepFloyd XL

First, clone this repository:

git clone https://github.com/basetenlabs/truss-examples/
cd deepfloyd-xl-truss

Before deployment:

  1. Make sure you have a Baseten account and API key.
  2. Install the latest version of Truss: pip install --upgrade truss

With deepfloyd-xl-truss as your working directory, you can deploy the model with:

truss push --trusted

Paste your Baseten API key if prompted.

For more information, see Truss documentation.

DeepFloyd API documentation

Input

This deployment of DeepFloyd takes a dictionary as input, which requires the following key:

  • prompt - the prompt for image generation

It also supports a number of other parameters detailed in this blog post.

Output

The result will be a dictionary containing:

  • status - either success or failed
  • data - a list of base64-encoded images
  • message - contains details if an error occurred

For example:

{
  "status": "success",
  "data": ["/9j/4AAQSkZJRgABAQAAAQABAA...."],
  "message": null
}
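
A response in this shape can be turned back into image bytes with the standard library. The sketch below assumes the response has already been parsed into a Python dictionary; the function name `decode_images` is hypothetical. The `/9j/` prefix in the sample above is what JPEG data looks like when base64-encoded, so saving with a `.jpg` extension is a reasonable guess, not a guarantee.

```python
import base64
from typing import List


def decode_images(response: dict) -> List[bytes]:
    """Return the raw bytes of each base64-encoded image in the response."""
    if response.get("status") != "success":
        # Surface the server-provided message when generation failed.
        raise RuntimeError(f"Generation failed: {response.get('message')}")
    return [base64.b64decode(item) for item in response.get("data") or []]


# Example with a stand-in payload (real payloads are much longer).
sample = {
    "status": "success",
    "data": [base64.b64encode(b"\xff\xd8\xff\xe0 fake jpeg").decode("ascii")],
    "message": None,
}
images = decode_images(sample)
```

Each element of `images` can then be written to disk in binary mode, for example `open("output_0.jpg", "wb").write(images[0])`.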

Example usage

truss predict -d '{"prompt": "man on moon"}'

You can also invoke it via cURL:

curl -X POST https://app.baseten.co/models/EqwKvqa/predict \
  -H 'Authorization: Api-Key {YOUR_API_KEY}' \
  -d '{"prompt": "man on moon"}'
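
The same request can be made from Python. The following is a minimal sketch using only the standard library that mirrors the cURL call above. The helper name `build_predict_request` is hypothetical, and the model ID and API key are placeholders to replace with your own values.

```python
import json
import urllib.request


def build_predict_request(model_id: str, api_key: str, prompt: str) -> urllib.request.Request:
    """Build a POST request equivalent to the cURL example."""
    url = f"https://app.baseten.co/models/{model_id}/predict"
    body = json.dumps({"prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Authorization": f"Api-Key {api_key}"},
        method="POST",
    )


# Building the request does not send anything over the network.
request = build_predict_request("EqwKvqa", "YOUR_API_KEY", "man on moon")
```

To send it, pass the request to `urllib.request.urlopen` and parse the body, for example `with urllib.request.urlopen(request) as resp: result = json.load(resp)`.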