by Yuxiang Ji*, Boyong He*, Chenyuan Qu, Zhuoyue Tan, Chuan Qin, Liaoni Wu
Pre-trained diffusion models have demonstrated remarkable proficiency in synthesizing images across a wide range of scenarios with customizable prompts, indicating their capacity to capture universal features. Motivated by this, our study explores how the implicit knowledge embedded within diffusion models can be used to address challenges in cross-domain semantic segmentation. This paper investigates an approach that leverages sampling and fusion techniques to harness the features of diffusion models efficiently. We propose DIffusion Feature Fusion (DIFF) as a backbone for extracting and integrating effective semantic representations through the diffusion process. Leveraging the text-to-image generation capability, we further introduce a training framework designed to implicitly learn posterior knowledge from the diffusion model.
Relying on the diffusion-based encoder, our approach improves the previous state of the art by 2.7 mIoU on GTA→Cityscapes, 4.98 mIoU on GTA→ACDC, and 11.69 mIoU on GTA→Dark Zurich.
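At its core, DIFF treats a frozen Stable Diffusion UNet as a feature extractor: an image is encoded into the latent space, perturbed at a chosen timestep, and the multi-scale decoder activations produced by a single denoising step are collected for fusion. The snippet below is a minimal sketch of this idea using diffusers; the hook placement, timestep, and fusion details are illustrative assumptions, not the actual implementation (see mmseg/models/backbones/diff for that).

# Illustrative sketch only: extract multi-scale UNet decoder features
# from one denoising step of Stable Diffusion v2-1.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained("stabilityai/stable-diffusion-2-1")
pipe.to("cuda")

features = []
hooks = [
    blk.register_forward_hook(lambda m, i, o: features.append(o))
    for blk in pipe.unet.up_blocks  # tap the UNet decoder blocks
]

@torch.no_grad()
def extract_features(image, prompt="", t=50):
    # image: Bx3xHxW tensor in [-1, 1], on the same device as the pipeline.
    features.clear()
    latents = pipe.vae.encode(image).latent_dist.sample() * pipe.vae.config.scaling_factor
    noise = torch.randn_like(latents)
    timestep = torch.tensor([t], device=latents.device)
    noisy = pipe.scheduler.add_noise(latents, noise, timestep)
    # Text conditioning; an empty prompt suffices for pure feature extraction.
    tokens = pipe.tokenizer(prompt, return_tensors="pt", padding="max_length",
                            max_length=pipe.tokenizer.model_max_length).input_ids.to(latents.device)
    cond = pipe.text_encoder(tokens)[0]
    pipe.unet(noisy, timestep, encoder_hidden_states=cond)
    return list(features)  # multi-scale maps, ready for fusion (e.g., 1x1 convs)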
For this project, we used Python 3.8.18. We recommend setting up a new virtual environment:
python -m venv ~/venv/diff
source ~/venv/diff/bin/activate
In that environment, the requirements can be installed with:
pip install -r requirements.txt -f https://download.pytorch.org/whl/torch_stable.html
pip install mmcv-full==1.3.7 # requires the other packages to be installed first
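After installation, an optional sanity check verifies that the pinned PyTorch and mmcv-full versions import correctly and that CUDA is visible:

python -c "import torch, mmcv; print(torch.__version__, mmcv.__version__, torch.cuda.is_available())"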
Further, please download the Stable-Diffusion v2-1 weights from HuggingFace. Please refer to the instructions at Stable-Diffusion.
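If you prefer a scripted download, the weights can also be fetched with huggingface_hub; the target directory below is an assumption, so place the weights wherever your configuration expects them:

python -c "from huggingface_hub import snapshot_download; snapshot_download('stabilityai/stable-diffusion-2-1', local_dir='pretrained/stable-diffusion-2-1')"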
All experiments were executed on an NVIDIA RTX A6000.
Cityscapes: Please download leftImg8bit_trainvaltest.zip and gtFine_trainvaltest.zip from here and extract them to data/cityscapes.
GTA: Please download all image and label packages from here and extract them to data/gta.
More details on dataset preparation can be found at DAFormer; the expected directory layout after extraction is sketched below.
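Assuming a DAFormer-style preparation, the resulting layout should roughly be (subfolder names follow the extracted archives):

data
├── cityscapes
│   ├── leftImg8bit
│   └── gtFine
└── gta
    ├── images
    └── labels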
To run a simple experiment on GTA→Cityscapes:
python run_experiments.py --exp 50
More information about the available configurations and experiments can be found in diff_config.yaml.
If you want to use the DIFF module as a backbone for other tasks, you can simply copy the whole mmseg/models/backbones/diff directory and use DIFFEncoder there, as sketched below.
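As a sketch, the copied backbone can then be referenced by its registered type name in a standard mmsegmentation config; the head choice and any extra DIFFEncoder keyword arguments below are assumptions to verify against the class definition:

# Hypothetical mmseg config snippet; check DIFFEncoder's constructor
# in mmseg/models/backbones/diff for the actual arguments.
model = dict(
    type='EncoderDecoder',
    backbone=dict(type='DIFFEncoder'),
    decode_head=dict(type='DAFormerHead'),  # placeholder head; configure as usual
)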
This project is based on the following open-source projects. We thank their authors for making the source code publicly available.
If you find our work useful in your research, please consider citing:
@misc{ji2024diffusion,
  title={Diffusion Features to Bridge Domain Gap for Semantic Segmentation},
  author={Yuxiang Ji and Boyong He and Chenyuan Qu and Zhuoyue Tan and Chuan Qin and Liaoni Wu},
  year={2024},
  eprint={2406.00777},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}