Triangulation Learning Network: from Monocular to Stereo 3D Object Detection

Watch the video

Created by Zengyi Qin, Jinglu Wang and Yan Lu. The repository contains an implementation of this CVPR paper. The detection pipeline is modified from AVOD.


Related Project

MonoGRNet: A Geometric Reasoning Network for 3D Object Localization

Please cite this paper if you find the repository helpful:

@article{qin2019tlnet, 
  title={Triangulation Learning Network: from Monocular to Stereo 3D Object Detection}, 
  author={Zengyi Qin and Jinglu Wang and Yan Lu},
  journal={IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2019}
}

Introduction

We study the problem of 3D object detection from stereo images, in which the key challenge is how to effectively utilize stereo information. Unlike previous methods that use pixel-level depth maps, we propose to employ 3D anchors to explicitly construct geometric correspondences between the regions of interest in stereo images, from which the deep neural network learns to detect and triangulate the targeted object in 3D space. We also present a cost-efficient channel reweighting strategy that enhances representational features and weakens noisy signals to facilitate the learning process. All of these are flexibly integrated into a baseline detector, achieving state-of-the-art performance in 3D object detection and localization on the challenging KITTI dataset.
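To make the 3D-anchor idea concrete, the sketch below projects one anchor box into a left and a right image and returns the two resulting regions of interest. It is a minimal illustration under stated assumptions, not the repository's implementation: the function names, the axis-aligned box, and the toy intrinsics and baseline are all hypothetical. On KITTI, the actual projection matrices for the left and right color cameras come from the calibration files.

```python
# Illustrative sketch only: every name and number here is an assumption,
# not code or calibration taken from TLNet.

def project_point(P, point):
    """Project a 3D point (x, y, z) in camera coordinates with a 3x4 matrix P."""
    x, y, z = point
    homo = (x, y, z, 1.0)
    u, v, w = (sum(P[r][c] * homo[c] for c in range(4)) for r in range(3))
    return u / w, v / w


def anchor_corners(center, size):
    """Return the 8 corners of an axis-aligned box from its center and (w, h, l)."""
    cx, cy, cz = center
    w, h, l = size
    return [(cx + sx * w / 2, cy + sy * h / 2, cz + sz * l / 2)
            for sx in (-1, 1) for sy in (-1, 1) for sz in (-1, 1)]


def project_anchor(P, center, size):
    """Return the 2D box (u_min, v_min, u_max, v_max) enclosing the projected corners."""
    pts = [project_point(P, c) for c in anchor_corners(center, size)]
    us = [p[0] for p in pts]
    vs = [p[1] for p in pts]
    return min(us), min(vs), max(us), max(vs)


def stereo_matrices(f, cu, cv, baseline):
    """Build toy projection matrices for a rectified stereo pair (left, right)."""
    left = [[f, 0.0, cu, 0.0], [0.0, f, cv, 0.0], [0.0, 0.0, 1.0, 0.0]]
    right = [[f, 0.0, cu, -f * baseline], [0.0, f, cv, 0.0], [0.0, 0.0, 1.0, 0.0]]
    return left, right


# Toy values: focal length and principal point in pixels, baseline in meters.
P_left, P_right = stereo_matrices(f=700.0, cu=600.0, cv=180.0, baseline=0.54)
center, size = (2.0, 1.0, 20.0), (1.6, 1.5, 3.9)
roi_left = project_anchor(P_left, center, size)
roi_right = project_anchor(P_right, center, size)
```

In a rectified pair the two regions share the same vertical extent and differ only by a horizontal shift that shrinks with depth, which is the geometric cue a network can use to triangulate the object.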

Prerequisites

  • Ubuntu 16.04
  • Python 3.6
  • TensorFlow 1.3.0

Setup

Clone this repository

git clone https://github.com/Zengyi-Qin/TLNet.git

Download the KITTI Object Detection Dataset (left images, right images, calibration files and labels) and place it in ~/Kitti/object under your home folder. Also download train.txt, val.txt, trainval.txt, planes and score from here. The planes folder contains the ground plane parameters, and the score folder contains the ground-truth 2D objectness confidence maps. The data folder should have the following layout:

Kitti
    object
        testing
        training
            calib
            image_2
            image_3
            label_2
            planes
            score
        train.txt
        trainval.txt
        val.txt
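Before running the setup script, it can help to confirm that the folder matches the layout above. The following is a small, hypothetical helper (not part of the repository) that lists any expected directory or file that is missing under a given root.

```python
import os

# Expected entries relative to the dataset root, copied from the layout above.
EXPECTED_DIRS = [
    "testing",
    "training/calib",
    "training/image_2",
    "training/image_3",
    "training/label_2",
    "training/planes",
    "training/score",
]
EXPECTED_FILES = ["train.txt", "trainval.txt", "val.txt"]


def missing_entries(root):
    """Return the expected directories and files that are absent under root."""
    root = os.path.expanduser(root)
    missing = [d for d in EXPECTED_DIRS
               if not os.path.isdir(os.path.join(root, d))]
    missing += [f for f in EXPECTED_FILES
                if not os.path.isfile(os.path.join(root, f))]
    return missing


if __name__ == "__main__":
    # Prints nothing when the layout is complete.
    for entry in missing_entries("~/Kitti/object"):
        print("missing:", entry)
```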

Add tlnet to your PYTHONPATH:

export PYTHONPATH=$PYTHONPATH:'path/to/tlnet'

Run the following command to download the pretrained model, compile required modules and generate mini-batches for training:

python setup.py

Training

Run the training script with specific configs:

python avod/experiments/run_training.py --pipeline_config=avod/configs/pyramid_cars_with_aug_example.config --data_split='train' --device=GPU_TO_USE

Evaluation

python avod/experiments/run_evaluation.py --pipeline_config=avod/configs/pyramid_cars_with_aug_example.config --data_split='val' --device=GPU_TO_USE

Inference

python avod/experiments/run_inference.py --checkpoint_name='pyramid_cars_with_aug_example' --data_split='val' --ckpt_indices=-1 --device=GPU_TO_USE

where --ckpt_indices=-1 selects the latest saved checkpoint. The difference between evaluation and inference is that evaluation automatically runs the official KITTI evaluation, while inference does not.
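The three commands above differ only in the script and a few flags, so they can be assembled programmatically. The sketch below is a hypothetical convenience wrapper, not something shipped with the repository; build_command and its parameters are assumed names, while the script paths and flags are copied from the commands above. The device argument stands in for the GPU_TO_USE placeholder.

```python
CONFIG = "avod/configs/pyramid_cars_with_aug_example.config"
CHECKPOINT = "pyramid_cars_with_aug_example"


def build_command(mode, device, ckpt_indices=-1):
    """Return the argument list for training, evaluation, or inference."""
    if mode == "training":
        return ["python", "avod/experiments/run_training.py",
                "--pipeline_config=" + CONFIG,
                "--data_split=train",
                "--device=" + str(device)]
    if mode == "evaluation":
        return ["python", "avod/experiments/run_evaluation.py",
                "--pipeline_config=" + CONFIG,
                "--data_split=val",
                "--device=" + str(device)]
    if mode == "inference":
        return ["python", "avod/experiments/run_inference.py",
                "--checkpoint_name=" + CHECKPOINT,
                "--data_split=val",
                "--ckpt_indices=" + str(ckpt_indices),
                "--device=" + str(device)]
    raise ValueError("unknown mode: " + mode)


# Example: the inference command for device 0 and the latest checkpoint.
inference_cmd = build_command("inference", device=0)
```

The resulting list can be passed to subprocess.run(cmd, check=True) from the repository root. Passing arguments as a list avoids shell quoting, which is why the quotes around 'val' and the checkpoint name are dropped here.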