Generation of crops from the real datasets

The instructions below describe how to generate the crops used for pre-training CroCo v2 from the following real-world datasets: ARKitScenes, MegaDepth, 3DStreetView and IndoorVL.

Download the metadata of the crops to generate

First, download the metadata and put them in ./data/:

mkdir -p data
cd data/
wget https://download.europe.naverlabs.com/ComputerVision/CroCo/data/crop_metadata.zip
unzip crop_metadata.zip
rm crop_metadata.zip
cd ..
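
On a machine without wget or unzip, the same steps can be reproduced with the Python standard library. The following is a minimal sketch, not part of the repository: the function name download_and_extract and its parameters are hypothetical, and only the URL and the ./data/ location come from the commands above.

```python
import urllib.request
import zipfile
from pathlib import Path

# URL taken from the wget command above.
METADATA_URL = (
    "https://download.europe.naverlabs.com/ComputerVision/CroCo/data/crop_metadata.zip"
)


def download_and_extract(url: str = METADATA_URL, data_dir: str = "data") -> Path:
    """Hypothetical helper: fetch a zip archive into data_dir, unpack it, delete it."""
    target_dir = Path(data_dir)
    target_dir.mkdir(parents=True, exist_ok=True)   # mkdir -p data
    archive = target_dir / "crop_metadata.zip"
    urllib.request.urlretrieve(url, archive)        # wget <url>
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(target_dir)                   # unzip crop_metadata.zip
    archive.unlink()                                # rm crop_metadata.zip
    return target_dir
```

Calling download_and_extract() from the repository root should leave the extracted metadata under ./data/, as the shell commands do.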

Prepare the original datasets

Second, download the original datasets into ./data/original_datasets/.

mkdir -p data/original_datasets

ARKitScenes

Download the raw dataset from https://github.com/apple/ARKitScenes/blob/main/DATA.md and put it in ./data/original_datasets/ARKitScenes/. The resulting file structure should look like:

./data/original_datasets/ARKitScenes/
└───Training
    └───40753679
    │   │   ultrawide
    │   │   ...
    └───40753686
    │   │
    │   ...

MegaDepth

Download the MegaDepth v1 dataset from https://www.cs.cornell.edu/projects/megadepth/ and put it in ./data/original_datasets/MegaDepth/. The resulting file structure should look like:

./data/original_datasets/MegaDepth/
└───0000
│   └───images
│   │   │   1000557903_87fa96b8a4_o.jpg
│   │   └ ...
│   └─── ...
└───0001
│   │
│   └ ...
└─── ...

3DStreetView

Download the 3D_Street_View dataset from https://github.com/amir32002/3D_Street_View and put it in ./data/original_datasets/3DStreetView/. The resulting file structure should look like:

./data/original_datasets/3DStreetView/
└───dataset_aligned
│   └───0002
│   │   │   0000002_0000001_0000002_0000001.jpg
│   │   └ ...
│   └─── ...
└───dataset_unaligned
│   └───0003
│   │   │   0000003_0000001_0000002_0000001.jpg
│   │   └ ...
│   └─── ...

IndoorVL

Download the IndoorVL datasets using Kapture.

pip install kapture
mkdir -p ./data/original_datasets/IndoorVL
cd ./data/original_datasets/IndoorVL
kapture_download_dataset.py update
kapture_download_dataset.py install  "HyundaiDepartmentStore_*"
kapture_download_dataset.py install  "GangnamStation_*"
cd -
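
Before starting the extraction, it can help to confirm that the metadata and the dataset roots are where the extraction step expects them. The sketch below is an illustration under assumptions, not a script shipped with the repository: the function missing_paths and the EXPECTED_SUBDIRS table are hypothetical, and the sub-directories it checks are only those visible in the file trees above (the MegaDepth and IndoorVL layouts are not checked beyond the root directory).

```python
from pathlib import Path

DATASETS = ["ARKitScenes", "MegaDepth", "3DStreetView", "IndoorVL"]

# Sub-directories taken from the file trees in this README (assumed, not exhaustive).
EXPECTED_SUBDIRS = {
    "ARKitScenes": ["Training"],
    "MegaDepth": [],
    "3DStreetView": ["dataset_aligned", "dataset_unaligned"],
    "IndoorVL": [],
}


def missing_paths(data_dir: str = "data") -> list:
    """Return the expected files and directories that are absent under data_dir."""
    data = Path(data_dir)
    missing = []
    for name in DATASETS:
        # Crop metadata file consumed by the extraction script.
        crops_file = data / "crop_metadata" / name / "crops_release.txt"
        if not crops_file.is_file():
            missing.append(crops_file)
        # Dataset root plus the sub-directories listed above.
        root = data / "original_datasets" / name
        for path in [root] + [root / sub for sub in EXPECTED_SUBDIRS[name]]:
            if not path.is_dir():
                missing.append(path)
    return missing
```

An empty list from missing_paths() means every path checked here exists; it says nothing about whether the datasets themselves are complete.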

Extract the crops

Now, extract the crops for each of the datasets:

for dataset in ARKitScenes MegaDepth 3DStreetView IndoorVL; do
  python3 datasets/crops/extract_crops_from_images.py \
    --crops ./data/crop_metadata/${dataset}/crops_release.txt \
    --root-dir ./data/original_datasets/${dataset}/ \
    --output-dir ./data/${dataset}_crops/ \
    --imsize 256 \
    --nthread 8 \
    --max-subdir-levels 5 \
    --ideal-number-pairs-in-dir 500
done

Note for IndoorVL

Due to some legal issues, we can only release 144,228 pairs out of the 1,593,689 pairs used in the paper. To compensate for this in terms of the number of pre-training iterations, the pre-training command in this repository uses 125 training epochs, including 12 warm-up epochs, and a cosine learning-rate schedule of 250, instead of 100, 10 and 200 respectively. The impact on performance is negligible.
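
The extraction loop in this section can also be driven from Python, for instance to preview the commands before running them. This is a minimal sketch under assumptions: build_command and extract_all are hypothetical helpers, while the script path and every flag value are copied from the shell loop. Unlike the shell loop, this version stops at the first dataset whose extraction fails.

```python
import subprocess
from pathlib import Path

DATASETS = ["ARKitScenes", "MegaDepth", "3DStreetView", "IndoorVL"]


def build_command(dataset: str, data_dir: str = "data") -> list:
    """Build the argument list for one dataset, mirroring the shell loop."""
    data = Path(data_dir)
    return [
        "python3", "datasets/crops/extract_crops_from_images.py",
        "--crops", str(data / "crop_metadata" / dataset / "crops_release.txt"),
        "--root-dir", str(data / "original_datasets" / dataset),
        "--output-dir", str(data / f"{dataset}_crops"),
        "--imsize", "256",
        "--nthread", "8",
        "--max-subdir-levels", "5",
        "--ideal-number-pairs-in-dir", "500",
    ]


def extract_all(data_dir: str = "data", dry_run: bool = False) -> None:
    """Run (or, with dry_run=True, only print) the extraction for every dataset."""
    for dataset in DATASETS:
        cmd = build_command(dataset, data_dir)
        if dry_run:
            print(" ".join(cmd))
        else:
            # check=True raises CalledProcessError if the script exits non-zero.
            subprocess.run(cmd, check=True)
```

Calling extract_all(dry_run=True) from the repository root prints the four commands without executing anything.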