Exploiting Multi-Layer Features Using a CNN-RNN Approach for RGB-D Object Recognition
This code provides an approach that brings a pre-trained CNN model together with multiple-fixed recursive neural networks (RNNs) for RGB-D object recognition. The pre-trained CNN model is used as a low-level feature extractor at different levels. The VGG-F model is particularly used in this work but other models can be used as well. The multiple RNNs have been employed to map the outputs of the CNN into a lower-dimensional space and to learn higher level representations by applying the same operations recursively in a tree structure. Thus, RNNs reduce the feature space dimensionality and allow us to transfer information from multiple layers effectively.
The approach proposed in this study is used for RGB-D object recognition. However, it can easily be applied to vision problems based on feature extraction such as detection, semantic segmentation, and action recognition.
The code is based on the implementation of "Socher, Richard, et al. "Convolutional-recursive deep learning for 3d object classification." NeurIPS (2012)". The implementation is in Matlab and requires MatConvNet. For the classification, Liblinear package is used.
This work is presented in ECCV 2018 Workshop: 3D Reconstruction meets Semantics. If you find this code useful for your research, please consider citing:
@inproceedings{exploitCNNRNN_ECCV18,
title = {Exploiting Multi-Layer Features Using a CNN-RNN Approach for RGB-D Object Recognition},
author = {Caglayan, Ali and Can, Ahmet Burak},
booktitle = {Computer Vision - ECCV 2018 Workshops},
editor = {Leal-Taixé, Laura and Roth, Stefan},
series = {Lecture Notes in Computer Science},
publisher = {Springer, Cham},
year = {2018},
volume = {11131},
pages = {675-688}
}