This is the code implementation for the paper *Improving Interpretation Faithfulness for Vision Transformers*, published at ICML 2024 (Spotlight).
Please set up the environment from `env.yaml` using conda. After that, clone https://github.com/openai/guided-diffusion and place it in the same directory as this repo. Note that we leverage the pre-trained 256x256 diffusion model (not class conditional): `256x256_diffusion_uncond.pt`.
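
A minimal setup sketch is shown below. The environment name `fvit` is an assumption (check the `name:` field in `env.yaml`), and the checkpoint URL is the one published in the guided-diffusion README; place the downloaded checkpoint wherever the demo notebook expects it.

```bash
# Create and activate the conda environment
# (environment name is an assumption; see the name: field in env.yaml).
conda env create -f env.yaml
conda activate fvit

# Clone guided-diffusion next to this repo.
git clone https://github.com/openai/guided-diffusion

# Download the unconditional 256x256 diffusion checkpoint
# (URL taken from the guided-diffusion README).
wget https://openaipublicdiffusion.blob.core.windows.net/diffusion/jul-2021/256x256_diffusion_uncond.pt
```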
We provide some example images in the `fig` folder. You can run the corresponding experiments in the `fvit-demo.ipynb` notebook to see the different explanation results.
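
For example, launch the notebook from the repo root so the relative paths to `fig` and the cloned guided-diffusion directory resolve (this assumes Jupyter is available in the environment; it may not be listed in `env.yaml`):

```bash
jupyter notebook fvit-demo.ipynb
```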
Our code implementation is based on the following awesome materials:
- https://github.com/jacobgil/vit-explain
- https://jacobgil.github.io/deeplearning/vision-transformer-explainability
- https://arxiv.org/abs/2005.00928
- https://github.com/hila-chefer/Transformer-Explainability
- https://github.com/openai/guided-diffusion
All rights belong to the original authors. If you find this work useful, please cite:

@inproceedings{huimproving,
  title={Improving Interpretation Faithfulness for Vision Transformers},
  author={Hu, Lijie and Liu, Yixin and Liu, Ninghao and Huai, Mengdi and Sun, Lichao and Wang, Di},
  booktitle={Forty-first International Conference on Machine Learning},
  year={2024}
}