The official repository of the paper "FactoFormer: Factorized Hyperspectral Transformers with Self-Supervised Pretraining", published in IEEE Transactions on Geoscience and Remote Sensing, vol. 62, pp. 1-14, 2024, Art. no. 5501614, doi: 10.1109/TGRS.2023.3343392.
Hyperspectral images (HSIs) contain rich spectral and spatial information. Motivated by the success of transformers in natural language processing and computer vision, where they have shown the ability to learn long-range dependencies within input data, recent research has focused on using transformers for HSIs. However, current state-of-the-art hyperspectral transformers only tokenize the input HSI sample along the spectral dimension, resulting in the under-utilization of spatial information. Moreover, transformers are known to be data-hungry and their performance relies heavily on large-scale pretraining, which is challenging due to limited annotated hyperspectral data. Therefore, the full potential of HSI transformers has not been realized. To overcome these limitations, we propose a novel factorized spectral-spatial transformer that incorporates factorized self-supervised pretraining procedures, leading to significant improvements in performance. The factorization of the inputs allows the spectral and spatial transformers to better capture the interactions within the hyperspectral data cubes. Inspired by masked image modeling pretraining, we also devise efficient masking strategies for pretraining each of the spectral and spatial transformers. We conduct experiments on six publicly available datasets for the HSI classification task and demonstrate that our model achieves state-of-the-art performance on all of them.
- [2024-06] Pretraining code released.
- [2023-12] Finetuning and testing code released with pretrained models.
Set up the environment and install required packages
- Create a conda environment with Python 3.7:
conda create --name factoformer python=3.7
conda activate factoformer
- Install PyTorch with a suitable CUDA toolkit version. See here:
conda install pytorch==1.13.1 torchvision==0.14.1 torchaudio==0.13.1 pytorch-cuda=11.6 -c pytorch -c nvidia
- Install other requirements:
pip install -r requirements.txt
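As an optional sanity check (an assumption on our part, not part of the official setup), you can verify that the installed PyTorch build sees your GPU before moving on:

```bash
# Optional: print the PyTorch version and whether CUDA is available.
python -c "import torch; print(torch.__version__, torch.cuda.is_available())"
```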
Download datasets and pretrained checkpoints
- Download the Indian Pines, University of Pavia, and Houston datasets using the links provided in SpectralFormer.
- Download the Wuhan datasets in .mat file format from here (download the split with 100 samples per class).
- Download our pretrained and finetuned checkpoints from the links provided in the following table.
| Dataset | Overall Acc. (%) | Average Acc. (%) | Pretrained Model (Spatial) | Pretrained Model (Spectral) | Finetuned Model |
|---|---|---|---|---|---|
| Indian Pines | 91.30 | 94.30 | spatial_ckpt | spectral_ckpt | finetuned_ckpt |
| University of Pavia | 95.19 | 93.64 | spatial_ckpt | spectral_ckpt | finetuned_ckpt |
| Houston 2013 | 89.13 | 90.12 | spatial_ckpt | spectral_ckpt | finetuned_ckpt |
| WHU-Hi-LongKou | 98.30 | 98.72 | spatial_ckpt | spectral_ckpt | finetuned_ckpt |
| WHU-Hi-HanChuan | 93.19 | 91.64 | spatial_ckpt | spectral_ckpt | finetuned_ckpt |
| WHU-Hi-HongHu | 92.26 | 92.38 | spatial_ckpt | spectral_ckpt | finetuned_ckpt |
For recreating the results reported in the paper using the finetuned checkpoints:
- e.g., running evaluation on the Indian Pines dataset:
python test.py --dataset='Indian' --model_path='<path_to_ckpt>'
For evaluating on other datasets, change the `--dataset` argument to `Pavia`, `Houston`, `WHU-Hi-HC`, `WHU-Hi-HH`, or `WHU-Hi-LK`, and replace `<path_to_ckpt>` with the path to the relevant checkpoint.
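If you have downloaded all six finetuned checkpoints, a short loop can reproduce every reported result in one go. This is a minimal sketch that assumes a hypothetical naming scheme of `./checkpoints/<dataset>_finetuned.pth`; adjust the paths to match wherever you saved the files:

```bash
# Evaluate all six datasets with their finetuned checkpoints.
# The checkpoint paths below are placeholders; edit them to match your layout.
for ds in Indian Pavia Houston WHU-Hi-HC WHU-Hi-HH WHU-Hi-LK; do
    python test.py --dataset="$ds" --model_path="./checkpoints/${ds}_finetuned.pth"
done
```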
For finetuning FactoFormer using the pretrained models:
- Indian Pines:
python main_finetune.py --dataset='Indian' --epochs=80 --learning_rate=3e-4 --pretrained_spectral='<path_to_ckpt>' --pretrained_spatial='<path_to_ckpt>' --output_dir='<path_to_out_dir>'
- University of Pavia:
python main_finetune.py --dataset='Pavia' --epochs=80 --learning_rate=1e-3 --pretrained_spectral='<path_to_ckpt>' --pretrained_spatial='<path_to_ckpt>' --output_dir='<path_to_out_dir>'
- Houston:
python main_finetune.py --dataset='Houston' --epochs=40 --learning_rate=2e-3 --pretrained_spectral='<path_to_ckpt>' --pretrained_spatial='<path_to_ckpt>' --output_dir='<path_to_out_dir>'
- Wuhan has three datasets, namely WHU-Hi-HanChuan, WHU-Hi-HongHu, and WHU-Hi-LongKou. Use the following snippet and change the `--dataset` argument to `WHU-Hi-HC`, `WHU-Hi-HH`, or `WHU-Hi-LK` for fine-tuning on each dataset (a loop sketch follows below):
python main_finetune.py --dataset='WHU-Hi-HC' --epochs=40 --learning_rate=1e-3 --pretrained_spectral='<path_to_ckpt>' --pretrained_spatial='<path_to_ckpt>' --output_dir='<path_to_out_dir>'
Replace `<path_to_ckpt>` with the relevant paths to the pretrained checkpoints and `<path_to_out_dir>` with the path to the intended output directory.
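To fine-tune on all three Wuhan datasets without editing the command by hand, a loop like the following can help. It is a sketch under the assumption that the pretrained checkpoints and output directories follow a per-dataset naming scheme; the paths are placeholders, not the repository's actual layout:

```bash
# Fine-tune FactoFormer on the three Wuhan datasets.
# Checkpoint and output paths are hypothetical placeholders.
for ds in WHU-Hi-HC WHU-Hi-HH WHU-Hi-LK; do
    python main_finetune.py --dataset="$ds" --epochs=40 --learning_rate=1e-3 \
        --pretrained_spectral="./pretrained/${ds}_spectral.pth" \
        --pretrained_spatial="./pretrained/${ds}_spatial.pth" \
        --output_dir="./finetuned/${ds}"
done
```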
For pretraining the spatial and spectral transformers, navigate to the `pretraining` folder.
- e.g., pretraining the spatial transformer on the Indian Pines dataset:
python main_pretrain.py --dataset='Indian' --pretrain_mode='spatial' --output_dir='<path_to_save_spatial_model>'
To pretrain the spectral transformer, change `--pretrain_mode` to `spectral` and change the `--output_dir` to avoid overwriting.
- e.g., pretraining the spectral transformer on the Indian Pines dataset:
python main_pretrain.py --dataset='Indian' --pretrain_mode='spectral' --output_dir='<path_to_save_spectral_model>'
By changing the `--dataset` argument in the above examples, you can pretrain the models on other datasets.
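Since finetuning expects both a spatial and a spectral checkpoint, both pretraining runs are typically needed per dataset. Below is a minimal sketch that runs both modes back to back; the output directory names are assumptions, chosen only to keep the two checkpoints from overwriting each other:

```bash
# Pretrain both factorized transformers for one dataset.
# Output directories are placeholders; keep them distinct per mode.
cd pretraining
for mode in spatial spectral; do
    python main_pretrain.py --dataset='Indian' --pretrain_mode="$mode" \
        --output_dir="./output/indian_${mode}"
done
```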
Please use the following BibTeX reference to cite our paper.
@ARTICLE{FactoFormer,
author={Mohamed, Shaheer and Haghighat, Maryam and Fernando, Tharindu and Sridharan, Sridha and Fookes, Clinton and Moghadam, Peyman},
journal={IEEE Transactions on Geoscience and Remote Sensing},
title={FactoFormer: Factorized Hyperspectral Transformers With Self-Supervised Pretraining},
year={2024},
volume={62},
number={},
pages={1-14},
doi={10.1109/TGRS.2023.3343392}}
We would like to acknowledge the following repositories: SpectralFormer, MAEST and SimMIM.