OpenSeeD

This is the official implementation of the paper "A Simple Framework for Open-Vocabulary Segmentation and Detection".

Demo video: openseed_9.4m.mp4

A more detailed demo video is also available on YouTube.

👉 [New] Demo code is available.
👉 [New] OpenSeeD has been accepted to ICCV 2023! Training code is available!

🚀 Key Features

A Simple Framework for Open-Vocabulary Segmentation and Detection.
Supports interactive segmentation: given a box as input, the model generates a mask.

💡 Installation

pip3 install torch==1.13.1 torchvision==0.14.1 --extra-index-url https://download.pytorch.org/whl/cu113
python -m pip install 'git+https://github.com/MaureenZOU/detectron2-xyz.git'
pip install git+https://github.com/cocodataset/panopticapi.git
python -m pip install -r requirements.txt
export DATASET=/path/to/dataset
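
Before running the demo or training, it can help to confirm that the DATASET variable is visible to Python and points at an existing directory. The snippet below is a minimal sketch of such a check; the helper name resolve_dataset_root is hypothetical and not part of OpenSeeD.

```python
import os
from pathlib import Path


def resolve_dataset_root(env=None):
    """Return the directory named by the DATASET environment variable.

    Hypothetical helper, not part of OpenSeeD: it only validates the
    variable exported in the installation steps.
    """
    env = os.environ if env is None else env
    value = env.get("DATASET")
    if not value:
        raise RuntimeError("DATASET is not set; export it before running OpenSeeD")
    root = Path(value).expanduser()
    if not root.is_dir():
        raise FileNotFoundError(f"DATASET points to a missing directory: {root}")
    return root
```

Calling resolve_dataset_root() with no arguments reads the real environment; passing a dict makes the check easy to try without touching your shell.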

Download the pretrained checkpoint from here.

💡 Demo script

python demo/demo_panoseg.py evaluate --conf_files configs/openseed/openseed_swint_lang.yaml --image_path images/animals.png --overrides WEIGHT /path/to/ckpt/model_state_dict_swint_51.2ap.pt

🔥 Remember to modify the vocabulary lists thing_classes and stuff_classes in demo_panoseg.py if you want to segment open-vocabulary objects.
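
As an illustration of what such an edit might look like, the sketch below defines a custom vocabulary and derives contiguous class ids for it. Only the names thing_classes and stuff_classes come from the note above; the example categories, the helper build_vocabulary, and the id layout (things first, then stuff) are assumptions, so check demo_panoseg.py for how the script actually consumes these lists.

```python
def build_vocabulary(thing_classes, stuff_classes):
    """Map each class name to a contiguous id, things first, then stuff.

    Hypothetical helper: the id layout is an assumption for illustration.
    """
    overlap = set(thing_classes) & set(stuff_classes)
    if overlap:
        raise ValueError(f"classes listed as both thing and stuff: {sorted(overlap)}")
    names = list(thing_classes) + list(stuff_classes)
    return {name: idx for idx, name in enumerate(names)}


# Example open-vocabulary categories (illustrative; choose your own).
thing_classes = ["zebra", "giraffe", "ostrich"]   # countable objects
stuff_classes = ["sky", "grass", "tree"]          # amorphous regions

class_to_id = build_vocabulary(thing_classes, stuff_classes)
```

Keeping countable objects in thing_classes and background regions in stuff_classes matters for panoptic output, since only things are split into separate instances.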

Evaluation on COCO

python train_net.py --original_load --eval_only --num-gpus 8 --config-file configs/openseed/openseed_swint_lang.yaml MODEL.WEIGHTS=/path/to/lang/weight

The checkpoint to use for /path/to/lang/weight can be downloaded from https://github.com/IDEA-Research/OpenSeeD/releases/download/openseed/model_state_dict_swint_51.2ap.pt. You should get 55.4 PQ.
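
For readers unfamiliar with the metric, PQ (panoptic quality) is computed per class as the summed IoU of matched segment pairs divided by |TP| + 0.5 |FP| + 0.5 |FN|, where a predicted and a ground-truth segment count as matched when their IoU exceeds 0.5; the reported score averages this over classes. The sketch below only illustrates the per-class formula on already-matched segments. The function name is hypothetical, and it does not replace the evaluation run by train_net.py.

```python
def panoptic_quality(matched_ious, num_false_positives, num_false_negatives):
    """Per-class PQ from the IoUs of matched (true-positive) segment pairs.

    Illustrative only: segment matching is assumed to have been done already.
    """
    for iou in matched_ious:
        if not 0.5 < iou <= 1.0:
            raise ValueError(f"a matched pair must have 0.5 < IoU <= 1, got {iou}")
    true_positives = len(matched_ious)
    denominator = true_positives + 0.5 * num_false_positives + 0.5 * num_false_negatives
    if denominator == 0:
        # No segments of this class in prediction or ground truth.
        return 0.0
    return sum(matched_ious) / denominator
```

For example, two matches with IoU 0.8 and 0.6 plus one false positive and one false negative give 1.4 / 3, about 0.467, or 46.7 on the percentage scale used above.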

💡 Some COCO-format data

Here is the COCO-format JSON file for evaluating on BDD and SUN.
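
To check such a file before evaluation, a quick summary of its contents can be useful. The sketch below assumes the standard COCO layout with top-level "images", "annotations", and "categories" keys, and also handles panoptic-style annotations that keep category ids under "segments_info". The function name summarize_coco is hypothetical, and the exact contents of the BDD and SUN files are not verified here.

```python
import json
from collections import Counter


def summarize_coco(path):
    """Count images, annotations, and per-category entries in a COCO-format file."""
    with open(path, "r", encoding="utf-8") as f:
        data = json.load(f)

    id_to_name = {c["id"]: c["name"] for c in data.get("categories", [])}
    counts = Counter()
    for ann in data.get("annotations", []):
        # Panoptic files list segments under "segments_info";
        # detection/instance files carry one "category_id" per annotation.
        for seg in ann.get("segments_info", [ann]):
            cid = seg.get("category_id")
            if cid is None:
                continue
            counts[id_to_name.get(cid, str(cid))] += 1

    return {
        "num_images": len(data.get("images", [])),
        "num_annotations": len(data.get("annotations", [])),
        "per_category": dict(counts),
    }
```

Categories that appear in the annotations but are missing from "categories" show up under their numeric id, which makes mismatched label maps easy to spot.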

Training OpenSeeD baseline

Training on COCO

python train_net.py --num-gpus 8 --config-file configs/openseed/openseed_swint_lang.yaml --lang_weight /path/to/lang/weight

The language weight to use for /path/to/lang/weight can be downloaded from https://github.com/IDEA-Research/OpenSeeD/releases/download/training/model_state_dict_only_language.pt.

Training on COCO + Objects365

python train_net.py --num-gpus 8 --config-file configs/openseed/openseed_swint_lang_o365.yaml --lang_weight /path/to/lang/weight

The language weight to use for /path/to/lang/weight can be downloaded from https://github.com/IDEA-Research/OpenSeeD/releases/download/training/model_state_dict_only_language.pt.
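
The two training commands differ only in the config file, so a small launcher can keep them in one place. The sketch below just assembles the argument list from the flags shown above; the CONFIGS table and the training_command helper are hypothetical conveniences, not part of the repository.

```python
import shlex

# Config files taken from the two training commands above.
CONFIGS = {
    "coco": "configs/openseed/openseed_swint_lang.yaml",
    "coco+o365": "configs/openseed/openseed_swint_lang_o365.yaml",
}


def training_command(dataset, lang_weight, num_gpus=8):
    """Build the train_net.py argument list for one training setup."""
    if dataset not in CONFIGS:
        raise KeyError(f"unknown setup {dataset!r}; choose from {sorted(CONFIGS)}")
    return [
        "python", "train_net.py",
        "--num-gpus", str(num_gpus),
        "--config-file", CONFIGS[dataset],
        "--lang_weight", lang_weight,
    ]


def as_shell(command):
    """Render the argument list as a copy-pasteable shell line."""
    return shlex.join(command)
```

The returned list can be passed to subprocess.run(cmd, check=True) from the repository root, or printed with as_shell for inspection first.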

Checkpoints

Swin-T model trained on COCO panoptic segmentation and Objects365 (weights).
Swin-L model fine-tuned on COCO panoptic segmentation (weights).
Swin-L model fine-tuned on ADE20K semantic segmentation (weights).

🦄 Model Framework

🌋 Results

Results on open segmentation
Results on task transfer and segmentation in the wild

Citing OpenSeeD

If you find our work helpful for your research, please consider citing the following BibTeX entry.

@article{zhang2023simple,
  title={A Simple Framework for Open-Vocabulary Segmentation and Detection},
  author={Zhang, Hao and Li, Feng and Zou, Xueyan and Liu, Shilong and Li, Chunyuan and Gao, Jianfeng and Yang, Jianwei and Zhang, Lei},
  journal={arXiv preprint arXiv:2303.08131},
  year={2023}
}
