Iteration-based methods regress the disparity error by predicting a residual disparity Δd_k, whereas SR-Stereo splits the disparity error into multiple segments and regresses them by predicting multiple disparity clips.
Compared with iteration-based methods, SR-Stereo differs in both the update unit and the regression objective. Specifically, we propose a stepwise regression unit that outputs range-controlled disparity clips rather than unconstrained residual disparities. Further, we design a separate regression objective for each stepwise regression unit, instead of simply using the disparity error.
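To make the idea of range-controlled clips with per-unit objectives concrete, here is a minimal single-pixel sketch. It is not the SR-Stereo implementation: the function names, the symmetric clip range, and the rule "each step moves at most `clip_range` toward the ground truth" are assumptions chosen for illustration.

```python
def clip(value, limit):
    """Clamp value to the symmetric range [-limit, limit]."""
    return max(-limit, min(limit, value))


def stepwise_targets(d_init, d_gt, num_steps, clip_range):
    """Hypothetical per-step regression targets for one pixel.

    Assumption: each step is allowed to correct at most `clip_range`
    pixels of the remaining disparity error, so the target of step k
    is the previous target plus a clipped segment of that error.
    """
    targets = []
    d = d_init
    for _ in range(num_steps):
        d = d + clip(d_gt - d, clip_range)  # one bounded disparity clip
        targets.append(d)
    return targets


def residual_targets(d_gt, num_steps):
    """Contrast case: every iteration is supervised with the full ground truth."""
    return [d_gt for _ in range(num_steps)]


if __name__ == "__main__":
    print(stepwise_targets(0.0, 10.0, 3, 4.0))  # [4.0, 8.0, 10.0]
    print(residual_targets(10.0, 3))            # [10.0, 10.0, 10.0]
```

In this sketch a 10-pixel error is covered by clips of 4, 4, and 2 pixels, so no single step has to regress more than the chosen range, whereas the residual-style targets ask every step to reach the ground truth directly.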
First, a robust stereo model, SR-Stereo, and a lightweight edge estimator are pre-trained on a large synthetic dataset with dense ground truth. Then, we use the pre-trained SR-Stereo and edge estimator to generate the edge map of the target domain, where the background pixels (i.e., non-edge region pixels) are used as edge pseudo-labels. Finally, we jointly fine-tune the pre-trained SR-Stereo using the edge pseudo-labels and the sparse ground-truth disparity.
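The joint fine-tuning step can be sketched as a sum of two masked losses. This is an illustrative sketch only: the 0.5 threshold, the L1 loss, the `edge_weight` parameter, and all function names are assumptions, not details taken from DAPE.

```python
def background_mask(edge_prob, threshold=0.5):
    """Select pixels whose predicted edge probability is below the threshold.

    Assumption: these non-edge pixels are the ones kept as edge pseudo-labels.
    """
    return [[p < threshold for p in row] for row in edge_prob]


def masked_l1(pred, target, mask):
    """Mean absolute error over the pixels selected by mask (0.0 if none)."""
    total, count = 0.0, 0
    for pred_row, target_row, mask_row in zip(pred, target, mask):
        for p, t, m in zip(pred_row, target_row, mask_row):
            if m:
                total += abs(p - t)
                count += 1
    return total / count if count else 0.0


def joint_loss(disp_pred, disp_gt, gt_valid,
               edge_pred, edge_pseudo, bg_mask, edge_weight=1.0):
    """Disparity loss on sparse ground truth plus edge loss on background pixels."""
    disp_term = masked_l1(disp_pred, disp_gt, gt_valid)    # sparse supervision
    edge_term = masked_l1(edge_pred, edge_pseudo, bg_mask)  # pseudo-label supervision
    return disp_term + edge_weight * edge_term
```

The masks are what let the two supervision signals complement each other in this sketch: the disparity term only sees pixels that have sparse ground truth, while the edge term only sees pixels kept as background pseudo-labels.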
Domain-adaptive visualization on KITTI:
Qualitative disparity estimation results of DAPE on ETH3D:
Qualitative disparity estimation results of DAPE on KITTI test set: