by Gyuseong Lee*, Wooseok Jang*, Jin Hyeon Kim, Jaewoo Jung, Seungryong Kim†
## Dataset Preparation
```bash
python -m domainbed.scripts.download --data_dir=/my/datasets/path
```
## Environment Setup
```bash
conda create -n MoA python=3.9.12
conda activate MoA
pip install -r requirements.txt
```
We use OpenCLIP ViT-B/16 for all experiments. The pretrained model can be loaded through timm with the following call:
```python
timm.create_model('vit_base_patch16_clip_224.laion2b', pretrained=True)
```
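For reference, here is a minimal sketch of loading the backbone and checking its output with timm (the `num_classes=0` argument and the dummy input are illustrative only; the training scripts below create the model internally):

```python
import timm
import torch

# Load the OpenCLIP ViT-B/16 backbone pretrained on LAION-2B.
# num_classes=0 returns a headless feature extractor.
model = timm.create_model('vit_base_patch16_clip_224.laion2b',
                          pretrained=True, num_classes=0)
model.eval()

# Dummy forward pass at the native 224x224 resolution.
x = torch.randn(1, 3, 224, 224)
with torch.no_grad():
    features = model(x)
print(features.shape)  # torch.Size([1, 768]) for ViT-B/16
```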
## Full fine-tuning
```bash
python train_all.py [train_name] --data_dir [domainbed_data_dir] --algorithm ERM \
    --dataset DomainNet --model vitbase --seed 1
```
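For context, ERM in DomainBed simply pools minibatches from all source domains and minimizes their average cross-entropy. A minimal sketch of one update step (the function and variable names are illustrative, not the repository's API):

```python
import torch
import torch.nn.functional as F

def erm_step(model, optimizer, minibatches):
    """One ERM update: concatenate minibatches from all source domains
    and minimize a single cross-entropy loss (illustrative sketch)."""
    x = torch.cat([x for x, _ in minibatches])
    y = torch.cat([y for _, y in minibatches])
    loss = F.cross_entropy(model(x), y)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```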
## LoRA

```bash
python train_all.py [train_name] --data_dir [domainbed_data_dir] --algorithm ERM \
    --dataset DomainNet --model nf_vitbase_lora --r 2 --seed 1
```
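To clarify what the `--r` flag controls, below is a minimal, illustrative LoRA layer, not the repository's implementation (which follows the official LoRA code): a frozen linear weight is augmented with a trainable rank-`r` update scaled by `alpha / r`.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Minimal LoRA layer: y = base(x) + (alpha / r) * x A^T B^T.
    Only the rank-r matrices A and B are trained; the base weight is frozen."""
    def __init__(self, in_features, out_features, r=2, alpha=1.0):
        super().__init__()
        self.base = nn.Linear(in_features, out_features, bias=False)
        self.base.weight.requires_grad_(False)             # frozen pretrained weight
        self.lora_A = nn.Parameter(torch.randn(r, in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(out_features, r))  # zero-init: no change at start
        self.scaling = alpha / r

    def forward(self, x):
        return self.base(x) + self.scaling * (x @ self.lora_A.T @ self.lora_B.T)
```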
## Mixture-of-LoRA
```bash
python train_all.py [train_name] --data_dir [domainbed_data_dir] --algorithm ERM \
    --dataset DomainNet --model nf_vitbase_moelora_last_qkv --seed 1
```
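The mixture variant places several LoRA experts on a frozen projection and lets a router weight them per token. A rough, illustrative sketch of the idea with a plain softmax router (the model definitions in the repository, including the cosine router, are the reference):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MixtureOfLoRA(nn.Module):
    """Illustrative mixture of LoRA experts on a frozen linear layer:
    a router produces per-token weights over E low-rank experts."""
    def __init__(self, dim, r=2, num_experts=4, alpha=1.0):
        super().__init__()
        self.base = nn.Linear(dim, dim, bias=False)
        self.base.weight.requires_grad_(False)            # frozen pretrained weight
        self.A = nn.Parameter(torch.randn(num_experts, r, dim) * 0.01)
        self.B = nn.Parameter(torch.zeros(num_experts, dim, r))
        self.router = nn.Linear(dim, num_experts)
        self.scaling = alpha / r

    def forward(self, x):                                 # x: (tokens, dim)
        gates = F.softmax(self.router(x), dim=-1)         # (tokens, E)
        delta = torch.einsum('td,erd->etr', x, self.A)    # per-expert x A_e^T
        delta = torch.einsum('etr,edr->etd', delta, self.B)
        mix = torch.einsum('te,etd->td', gates, delta)    # gate-weighted sum of experts
        return self.base(x) + self.scaling * mix
```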
## KAdaptation + Mixture-of-Attention (our best results)
```bash
python train_all.py nf_vitbase_moelora_every_qkv_new_laux --data_dir [domainbed_data_dir] --algorithm ERM \
    --dataset DomainNet --model nf_vitbase_moek_every_qkv_new --l_aux --seed 1
```
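The `--l_aux` flag adds an auxiliary loss on the router. As an illustration of what such a term typically looks like, here is the standard MoE load-balancing loss (Switch-Transformer style) that encourages tokens to be spread evenly across experts; the exact formulation used in the repository may differ.

```python
import torch

def load_balancing_loss(gates, expert_index):
    """Illustrative MoE auxiliary load-balancing loss.
    gates:        (tokens, E) softmax router probabilities
    expert_index: (tokens,)   expert chosen for each token"""
    num_experts = gates.shape[-1]
    # Fraction of tokens dispatched to each expert.
    dispatch = torch.bincount(expert_index, minlength=num_experts).to(gates.dtype)
    dispatch = dispatch / gates.shape[0]
    # Mean router probability assigned to each expert.
    importance = gates.mean(dim=0)
    # Minimized (value ~1) when both distributions are uniform.
    return num_experts * torch.sum(dispatch * importance)
```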
This code is heavily based on MIRO, SWAD, and DomainBed, and the LoRA implementation follows the official LoRA code. We also used the official implementation of KAdaptation, and our Cosine Router follows a public GitHub implementation. We highly appreciate the authors for their great work.
If you find this code useful, please consider citing our paper:
```bibtex
@article{lee2023domain,
  title={Domain Generalization Using Large Pretrained Models with Mixture-of-Adapters},
  author={Lee, Gyuseong and Jang, Wooseok and Kim, Jin Hyeon and Jung, Jaewoo and Kim, Seungryong},
  journal={arXiv preprint arXiv:2310.11031},
  year={2023}
}
```