Training DEVA

This repository supports training only the temporal propagation module. For the image module, please refer to the individual projects.

Setting Up Data

We put datasets out-of-source, as in XMem. You do not need BL30K. The directory structure should look like this:

├── Tracking-Anything-with-DEVA
├── DAVIS
│   ├── 2016
│   │   ├── Annotations
│   │   └── ...
│   └── 2017
│       ├── test-dev
│       │   ├── Annotations
│       │   └── ...
│       └── trainval
│           ├── Annotations
│           └── ...
├── static
│   ├── BIG_small
│   └── ...
├── YouTube
│   ├── all_frames
│   │   └── valid_all_frames
│   ├── train
│   └── valid
└── OVIS-VOS-train
    ├── JPEGImages
    └── Annotations
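
Before training, it can help to confirm that the layout above is in place. The following is a minimal sketch, not part of the repository: the function name `missing_dataset_dirs` and the constant `EXPECTED_DIRS` are hypothetical, and it assumes the datasets sit in the parent directory of `Tracking-Anything-with-DEVA`, as the tree shows. Entries shown as `...` in the tree are not checked.

```python
from pathlib import Path

# Relative paths copied from the directory tree above.
# Entries elided as "..." in the tree are intentionally left out.
EXPECTED_DIRS = [
    "DAVIS/2016/Annotations",
    "DAVIS/2017/test-dev/Annotations",
    "DAVIS/2017/trainval/Annotations",
    "static/BIG_small",
    "YouTube/all_frames/valid_all_frames",
    "YouTube/train",
    "YouTube/valid",
    "OVIS-VOS-train/JPEGImages",
    "OVIS-VOS-train/Annotations",
]


def missing_dataset_dirs(root):
    """Return the expected directories that do not exist under root."""
    root = Path(root)
    return [rel for rel in EXPECTED_DIRS if not (root / rel).is_dir()]


if __name__ == "__main__":
    # Assumption: the script is run from inside Tracking-Anything-with-DEVA,
    # so the datasets live one level up.
    missing = missing_dataset_dirs("..")
    for rel in missing:
        print(f"missing: {rel}")
    if not missing:
        print("all expected dataset directories found")
```

The check only looks at directory names; it does not verify that the contents are complete.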

You can try our download script, python -m scripts.download_dataset, but it may fail intermittently because Google Drive sometimes blocks the downloads. If it fails, please download the datasets manually. The links can be found in the script.

To generate OVIS-VOS-train, use a converter such as https://github.com/youtubevos/vis2vos, or download our preprocessed version from https://drive.google.com/uc?id=1AZPyyqVqOl6j8THgZ1UdNJY9R1VGEFrX.

Training Command

The training command is the same as in XMem. We have trained with 4 and 8 GPUs. With 8 GPUs, run:

python -m torch.distributed.run --master_port 25763 --nproc_per_node=8 deva/train.py --exp_id deva_retrain --stage 03
  • Change nproc_per_node to change the number of GPUs.
  • Prepend CUDA_VISIBLE_DEVICES=... if you want to use specific GPUs.
  • Change master_port if you encounter port collision.
  • exp_id is a unique experiment identifier that does not affect how the training is done.
  • Models will be saved in ./saves/.
  • We simply use the last trained model without model selection.
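
The options above can also be assembled programmatically. This is a rough sketch under stated assumptions, not a script shipped with the repository: the helpers `find_free_port`, `build_train_command`, and `build_env` are hypothetical names, and the experiment name `deva_retrain_4gpu` is only an illustration. It reproduces the command shown above, lets the OS suggest a free port to avoid collisions, and sets CUDA_VISIBLE_DEVICES for specific GPUs.

```python
import os
import socket


def find_free_port():
    """Ask the OS for a currently unused TCP port on localhost."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
        return s.getsockname()[1]


def build_train_command(num_gpus=8, exp_id="deva_retrain", master_port=None):
    """Assemble the training command from this section as an argv list."""
    if master_port is None:
        master_port = find_free_port()
    return [
        "python", "-m", "torch.distributed.run",
        "--master_port", str(master_port),
        f"--nproc_per_node={num_gpus}",
        "deva/train.py",
        "--exp_id", exp_id,
        "--stage", "03",
    ]


def build_env(gpu_ids=None):
    """Copy the current environment, optionally restricting visible GPUs."""
    env = dict(os.environ)
    if gpu_ids is not None:
        env["CUDA_VISIBLE_DEVICES"] = ",".join(str(i) for i in gpu_ids)
    return env


if __name__ == "__main__":
    # Hypothetical 4-GPU run on devices 0-3.
    cmd = build_train_command(num_gpus=4, exp_id="deva_retrain_4gpu")
    env = build_env(gpu_ids=[0, 1, 2, 3])
    print("CUDA_VISIBLE_DEVICES=" + env["CUDA_VISIBLE_DEVICES"], " ".join(cmd))
    # To actually launch: subprocess.run(cmd, env=env, check=True)
```

The port probe is best-effort: another process could claim the port between the check and the launch, so passing an explicit master_port remains the simplest fix if a collision persists.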