kapture-localization / tutorial

Table of Contents

Structure-based mapping and localization
Image retrieval benchmark
- Introduction
- Benchmark

Structure-based mapping and localization

In this tutorial, we us two pipelines (custom pipeline and COLMAP pipeline) to explain how to localize query images within a map. Before explaining how to use the code, we provide more details about the algorithmic workflow and the methods used.

For both pipelines (custom and COLMAP), we explain how to build a map using structure-from-motion leveraging known camera poses, we show how to localize query images within this map, and we show how to evaluate the precision of the obtained localization against the ground truth.

We will use the virtual_gallery_tutorial dataset as an example. It is a subset of the virtual gallery dataset. The sample dataset can be found in the samples/ folder. The tutorial can be used for any dataset that follows the Recommended dataset structure. Further information about the pipelines can be found in pipeline/README.

If you use this work for your research, please cite the respective paper:

@misc{kapture2020,
      title={Robust Image Retrieval-based Visual Localization using Kapture},
      author={Martin Humenberger and Yohann Cabon and Nicolas Guerin and Julien Morat and Jérôme Revaud and Philippe Rerole and Noé Pion and Cesar de Souza and Vincent Leroy and Gabriela Csurka},
      year={2020},
      eprint={2007.13867},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

Algorithmic workflow

In this section, we present the algorithmic workflow of our pipelines. In the custom pipeline, we follow a structure-based approach using image retrieval (any compatible global image representation can be used) to select the image pairs to match (using any compatible local feature type). In this pipeline, we use COLMAP for point triangulation and image registration. More details about the method can be found in our paper. In the COLMAP pipeline, we follow the COLMAP workflow which uses a vocabulary tree matcher and SIFT features. For more details about the COLMAP workflow, please refer to the documentation and this repository.

Fig. Map generation with SFM. Images from SceauxCastle Dataset. illustrates the algorithmic steps of the mapping part:

Extraction of local descriptors and keypoints (e.g. R2D2, D2-Net, SIFT) of training images
Extraction of global features (e.g. AP-GeM, NetVLAD, DenseVLAD) of training images
Computation of training image pairs using global features
Local descriptor matching and geometric verification of the image pairs
Point triangulation with COLMAP (point triangulator)

Figure 1. Map generation with SFM. Images from SceauxCastle Dataset.

Fig. Localization of query images in an SFM map. Images from SceauxCastle Dataset. illustrates the algorithmic steps of the localization part:

Extraction of local and global features (same local and global feature types as used for mapping) of query images
Retrieval of similar images from the training images (training-query image pairs)
Local descriptor matching and geometric verification
Camera pose estimation with COLMAP (image registrator)

Figure 2. Localization of query images in an SFM map. Images from SceauxCastle Dataset.

Recommended dataset structure

The pipeline scripts use the same locations of keypoints, descriptors, global features, and matches for multiple kapture folders. Note, a kapture folder contains sensor data and, if available, reconstruction data. More details about the structure of a kapture folder and the data it contains can be found in the kapture repository.

In order to share reconstruction data across different kapture folders, this data has to be stored outside of any kapture folder and the pipeline scripts will use these locations to assemble the correct kapture folders using symlinks.

The following example contains 3 kapture folders (mapping, query, map_plus_query) and 2 locations for shared reconstruction data (local_features, global_features). Note, since keypoint matches depend on the local feature type, they are stored in the respective subfolder. For example, r2d2_WASF-N8_20k/NN_no_gv means matches without geometric verification of 20k R2D2 features extracted using the model named r2d2_WASF-N8. Of course the naming of the matching method and the local features is arbitrary.

my_dataset
├─ mapping
│  └─ sensors
│     ├─ sensors.txt          # list of all sensors with their specifications (e.g. camera intrinsics)
│     ├─ trajectories.txt     # extrinsics (timestamp, sensor, pose)
│     ├─ records_camera.txt   # all records of type 'camera' (timestamp, sensor and path to image)
│     └─ records_data/        # image data path
├─ query
│  └─ sensors
│     ├─ sensors.txt          # list of all sensors with their specifications (e.g. camera intrinsics)
│     ├─ trajectories.txt     # if available: extrinsics (timestamp, sensor, pose)
│     ├─ records_camera.txt   # all records of type 'camera' (timestamp, sensor and path to image)
│     └─ records_data/        # image data path
├─ map_plus_query # kapture_merge.py with mapping/query inputs
│  └─ sensors
│     ├─ sensors.txt          # list of all sensors with their specifications (e.g. camera intrinsics)
│     ├─ trajectories.txt     # extrinsics (timestamp, sensor, pose)
│     ├─ records_camera.txt   # all records of type 'camera' (timestamp, sensor and path to image)
│     └─ records_data/        # image data path
├─ local_features
│  ├─ r2d2_WASF-N8_20k # 20k R2D2 with model r2d2_WASF-N8
│  │  ├─ keypoints/
│  │  ├─ descriptors/
│  │  ├─ NN_no_gv # match method, here: cross validation without geometric verification
│  │  │  └─ matches/
│  │  └─ NN_colmap_gv/ # match method, here: cross validation with COLMAP geometric verification
│  │     └─ matches/
│  └─ d2_tf
│     ├─ keypoints/
│     ├─ descriptors/
│     ├─ NN_no_gv # match method (see above)
│     │  └─ matches/
│     └─ NN_colmap_gv/ # match method (see above)
│        └─ matches/
└─ global_features
   └─ AP-GeM-LM18 # APGeM features with model AP-GeM-LM18
      └─ global_features

Install kapture-localization

See installation.adoc for more details.

For Windows users: Please use colmap.bat. If the colmap path is not available from your %PATH% environment variable, you have to provide it to kapture tools through the parameter -colmap, e.g. -colmap C:/Workspace/dev/colmap/colmap.bat.

Warning

Windows users need to have the file extension .py associated to the python3.6 executable and elevated rights to allow symlink. They should also enable long paths. See installation instructions for more details.

using docker

docker run --runtime=nvidia -it --rm  kapture/kapture-localization
cd kapture-localization

Prepare data

Before going through the kapture pipelines, local features and global features have to be extracted for each image.

precomputed features

For easy use of this tutorial, we provide precomputed local and global features (virtual_gallery_tutorial):

local features: R2D2 (500 kps per image), stored in ./local_features/r2d2_500/{descriptors,keypoints}.
global features: AP-GeM, stored in ./global_features/AP-GeM-LM18/global_features/.

extract own local features

Custom local features in the kapture format can be used as well. For example, R2D2 features can be extracted using extract_kapture.py provided in the R2D2 git repository. See here for more local feature types that are directly supported in kapture.

extract own global features

Custom global features in the kapture format can be used as well. For example, AP-GeM global features can be extracted using extract_kapture.py provided in the deep-image-retrieval git repository See here for more global feature types that are directly supported in kapture.

Features for mapping and query images need to be in the same folder (see the Recommended dataset structure above).

previous experiments

To be sure starting from scratch, unwanted files (e.g. previous experiments) should be deleted before running this tutorial.

cd samples/virtual_gallery_tutorial
./reset_tutorial_folder.py

Next, we will introduce two mapping and localization pipelines. The first one is a custom-built pipeline that can be used with any local or global feature type as well as custom keypoint matches, the second one is fully based on COLMAP and shows how COLMAP can be used with data provided in kapture format.

Kapture pipeline (custom)

1. Mapping

cd samples/virtual_gallery_tutorial # or a custom dataset
# if the COLMAP executable is not available from PATH,
# parameter -colmap needs to be set. example -colmap C:/Workspace/dev/colmap/colmap.bat
kapture_pipeline_mapping.py -v info \
    -i ./mapping \
    -kpt ./local_features/r2d2_500/keypoints \
    -desc ./local_features/r2d2_500/descriptors \
    -gfeat ./global_features/AP-GeM-LM18/global_features \
    -matches ./local_features/r2d2_500/NN_no_gv/matches \
    -matches-gv ./local_features/r2d2_500/NN_colmap_gv/matches \
    --colmap-map ./colmap-sfm/r2d2_500/AP-GeM-LM18_top5  # lfeat type / map pairs \
    --topk 5

kapture_pipeline_mapping.py will run the following sequence:

kapture_compute_image_pairs.py: associate similar images within the mapping set
kapture_compute_matches.py: compute 2D-2D matches using local features and the list of pairs
kapture_run_colmap_gv.py: run COLMAP geometric verification on the 2D-2D matches
kapture_colmap_build_map.py triangulate the 2D-2D matches to get 3D points and 2D-3D observations

The resulting list of image pairs and the 3D reconstruction (map) can be found in ./colmap-sfm/r2d2_500/AP-GeM-LM18_top5.

The map you can visualized using the COLMAP gui as follows:

colmap gui \
    --database_path ./colmap-sfm/r2d2_500/AP-GeM-LM18_top5/colmap.db \
    --image_path ./mapping/sensors/records_data \
    --import_path ./colmap-sfm/r2d2_500/AP-GeM-LM18_top5/reconstruction/ # only available in colmap 3.6

Note	For Windows user, replace "colmap" with the full path to "colmap.bat", as described in Install kapture-localization.

Note	For older versions of COLMAP (< 3.6) the model needs to be imported manually: menu `file` > `import model` > browse to `colmap-sfm/r2d2_500/AP-GeM-LM18_top5/reconstruction` > click `yes` and `save` in the following two dialogs.

As show in Fig. Map reconstruction in COLMAP., the 3D interface of COLMAP shows the 3D points and the cameras in the scene. A double-click on a camera will show the image and the observed 3D points will be highlighted.

Note	If you are using docker, you can simply use COLMAP GUI from host, even if the version is < 3.6.

Figure 3. Map reconstruction in COLMAP.

2. Localization

# If the COLMAP executable is not available from PATH, the parameter -colmap needs to be set
#   example: -colmap C:/Workspace/dev/colmap/colmap.bat
# For RobotCar or RobotCar_v2 --benchmark-style RobotCar_Seasons needs to be added.
# For Gangnam_Station --benchmark-style Gangnam_Station
# For Hyundai_Department_Store --benchmark-style Hyundai_Department_Store
# For RIO10 --benchmark-style RIO10
# For ETH-Microsoft --benchmark-style ETH_Microsoft
kapture_pipeline_localize.py -v info \
      -i ./mapping \
      --query ./query \
      -kpt ./local_features/r2d2_500/keypoints \
      -desc ./local_features/r2d2_500/descriptors \
      -gfeat ./global_features/AP-GeM-LM18/global_features \
      -matches ./local_features/r2d2_500/NN_no_gv/matches \
      -matches-gv ./local_features/r2d2_500/NN_colmap_gv/matches \
      --colmap-map ./colmap-sfm/r2d2_500/AP-GeM-LM18_top5 \
      -o ./colmap-localization/r2d2_500/AP-GeM-LM18_top5/AP-GeM-LM18_top5/ \
      --topk 5 \
      --config 2

kapture_pipeline_localize.py will run the following sequence:

kapture_compute_image_pairs.py associates similar images between the mapping and query sets
kapture_merge.py merges the mapping and query sensors into the same folder (necessary to compute shared matches)
kapture_compute_matches.py computes 2D-2D matches using local features and the list of pairs
kapture_run_colmap_gv.py runs geometric verification on the 2D-2D matches
kapture_colmap_localize.py runs the camera pose estimation
kapture_import_colmap.py imports the COLMAP results into kapture
kapture_evaluate.py if query ground truth is available, this evaluates the localization results
kapture_export_LTVL2020.py exports the localized images to a format compatible with the https://www.visuallocalization.net/ benchmark

In this script, the --config option will decide the parameters passed to the COLMAP image_registrator. The parameters are described in colmap_command.py.

The resulting ./colmap-localization/r2d2_500/AP-GeM-LM18_top5/AP-GeM-LM18_top5/eval/stats.txt will look similar to:

Model: colmap_config_2

Found 4 / 4 image positions (100.00 %).
Found 4 / 4 image rotations (100.00 %).
Localized images: mean=(0.0124m, 0.2086 deg) / median=(0.0110m, 0.1675 deg)
All: median=(0.0110m, 0.1675 deg)
Min: 0.0030m; 0.0539 deg
Max: 0.0246m; 0.4454 deg

(0.25m, 2.0 deg): 100.00%
(0.5m, 5.0 deg): 100.00%
(5.0m, 10.0 deg): 100.00%

If the dataset used is part of the online benchmark (not the case for virtual gallery), ./colmap-localization/r2d2_500/AP-GeM-LM18_top5/AP-GeM-LM18_top5/LTVL2020_style_result.txt contains the results in compatible format.

To visualise the queries in the map, COLMAP gui can be used as follows:

colmap gui \
    --database_path ./colmap-localization/r2d2_500/AP-GeM-LM18_top5/AP-GeM-LM18_top5/colmap_localized/colmap.db \
    --image_path query/sensors/records_data \
    --import_path ./colmap-localization/r2d2_500/AP-GeM-LM18_top5/AP-GeM-LM18_top5/colmap_localized/reconstruction/ # only available in colmap 3.6

Figure 4. Query localized in COLMAP.

Examples

This section presents examples of how to use the custom pipeline with some public datasets. To use these examples with other datasets that are available in kapture format, only very little adaptions are needed (some parameters need to be changed; please see the documentation of the source code of the functions used for more details).

The example scripts can be found in kapture-localization/pipeline/examples.

We will use the pre-built docker container for these examples.

docker pull kapture/kapture-localization
docker run --runtime=nvidia -it --rm --volume <my_data>:<my_data> kapture/kapture-localization

The path specified in WORKING_DIR (defined in the scripts) can be the same for all examples. There will be a subfolder that contains the downloaded datasets and a subfolder that contains the processed data for each example.

NAVER LABS Localization datasets (Gangnam Station and Hyundai Department Store)

1) Point WORKING_DIR in the scripts to a location where you want the dataset to be downloaded and processed data to be stored.

2) The datasets consist of 5 scenes, 2 for GangnamStation and 3 for Hyundai Department Store. If you do not want to process all of them, modify the for loops in the scripts. For example, the visual localization challenge in the LTVL workshop (ICCV 2021) only requires B2 from Gangnam and 1F from Hyundai Department Store.

3) Execute the scripts.

cd kapture-localization/pipeline/examples
./run_gangnam.sh
./run_hyundai_dept_store.sh

4) If everything was successful, you should get a file named GangnamStation_LTVL2020_style_result_all_scenes_r2d2_WASF-N8_20k_Resnet101-AP-GeM-LM18.txt in ${WORKING_DIR}/GangnamStation and a file named HyundaiDepartmentStore_LTVL2020_style_result_all_scenes_r2d2_WASF-N8_20k_Resnet101-AP-GeM-LM18.txt in `${WORKING_DIR}/HyundaiDepartmentStore. These files can be uploaded to the benchmark at https://www.visuallocalization.net.