Code to perform zero-shot segmentation using CLIP's encoder and a U-Net decoder, conditioned on text prompts and trained on the PhraseCut dataset.
- Clone the PhraseCut dataset repository.
- Download the dataset images by running `python download_dataset.py` inside the PhraseCut directory.
- Run the generator to split the data.
- Run `Zero_Shot_Segmentation_Clip`.
- The model accepts a text prompt and an image as input.
- CLIP encodes the text prompt and the input image into embeddings.
- A U-Net decoder, trained on the PhraseCut dataset, produces a binary segmentation map from those embeddings.
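The pipeline above can be sketched in PyTorch. This is an illustrative toy, not the repository's actual model: the embedding dimension, the fusion by elementwise product, and the `TinyDecoder` architecture are all assumptions standing in for CLIP's encoders and the trained U-Net decoder.

```python
import torch
import torch.nn as nn

class TinyDecoder(nn.Module):
    """Toy U-Net-style upsampling decoder conditioned on a joint embedding.

    Stands in for the trained decoder; sizes are illustrative only.
    """
    def __init__(self, embed_dim=512):
        super().__init__()
        # Project the fused embedding onto a small spatial feature map
        self.project = nn.Linear(embed_dim, 128 * 8 * 8)
        self.up = nn.Sequential(
            nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1),  # 8 -> 16
            nn.ReLU(),
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1),   # 16 -> 32
            nn.ReLU(),
            nn.ConvTranspose2d(32, 1, 4, stride=2, padding=1),    # 32 -> 64
        )

    def forward(self, text_emb, image_emb):
        # Fuse the two modalities (elementwise product; an assumption here)
        joint = text_emb * image_emb
        x = self.project(joint).view(-1, 128, 8, 8)
        return torch.sigmoid(self.up(x))  # per-pixel foreground probability

# Stand-in embeddings; a real run would take these from CLIP's
# text and image encoders instead of random tensors.
text_emb = torch.randn(1, 512)
image_emb = torch.randn(1, 512)
probs = TinyDecoder()(text_emb, image_emb)          # shape (1, 1, 64, 64)
binary_mask = (probs > 0.5).float()                 # binary segmentation map
```

Thresholding the sigmoid output at 0.5 is what turns the per-pixel probabilities into the binary segmentation map described above.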
Basic code to perform segmentation using CLIPSeg can be found here.
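For comparison, a minimal CLIPSeg inference sketch using the Hugging Face `transformers` library is shown below. The checkpoint name `CIDAS/clipseg-rd64-refined` is the published CLIPSeg checkpoint; the blank placeholder image and the example prompts are assumptions, so substitute a real image in practice.

```python
import torch
from PIL import Image
from transformers import CLIPSegProcessor, CLIPSegForImageSegmentation

# Load the published CLIPSeg checkpoint
processor = CLIPSegProcessor.from_pretrained("CIDAS/clipseg-rd64-refined")
model = CLIPSegForImageSegmentation.from_pretrained("CIDAS/clipseg-rd64-refined")

# Placeholder image for the sketch; replace with any RGB image of interest
image = Image.new("RGB", (352, 352), color="white")
prompts = ["a cat", "a dog"]

inputs = processor(text=prompts, images=[image] * len(prompts),
                   return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Per-pixel probabilities for each prompt; threshold for a binary mask
probs = torch.sigmoid(outputs.logits)
binary = (probs > 0.5).float()
```

Each prompt gets its own probability map, so segmenting several phrases against one image only needs the prompt list changed.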