Review of Local Descriptor in RGB-D Object Recognition
The emergence of RGB-D (Red-Green-Blue-Depth) sensors, which provide both depth and RGB images, has opened new possibilities for the computer vision community. At the same time, the use of local features has grown over the last few years and has shown impressive results, especially in object recognition. This article surveys recent technical achievements in this area of research. We review the use of local descriptors as the feature representation extracted from RGB-D images in instance-level and category-level object recognition. We also highlight the role of depth images and how they can be combined with RGB images in constructing a local descriptor. Three approaches are used to incorporate depth images into a compact feature representation: the classical distribution-based approach, the kernel trick, and feature learning. In this article, we show that incorporating depth data improves the accuracy of object recognition.
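As a rough illustration of the distribution-based family mentioned above, the following minimal sketch builds a gradient-orientation histogram for a grayscale patch and for the corresponding depth patch, then concatenates the two. It is not the method of any particular surveyed work: the function names `orientation_histogram` and `rgbd_descriptor`, the 8-bin setting, and the choice of plain concatenation as the fusion step are all assumptions made for this example.

```python
import math

def orientation_histogram(patch, n_bins=8):
    """Magnitude-weighted gradient-orientation histogram of a 2-D patch.

    `patch` is a list of rows of numbers. Only interior pixels are used,
    so central differences are always defined. (Hypothetical helper.)
    """
    rows, cols = len(patch), len(patch[0])
    hist = [0.0] * n_bins
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            gx = (patch[r][c + 1] - patch[r][c - 1]) / 2.0  # horizontal derivative
            gy = (patch[r + 1][c] - patch[r - 1][c]) / 2.0  # vertical derivative
            magnitude = math.hypot(gx, gy)
            angle = math.atan2(gy, gx) % (2 * math.pi)      # orientation in [0, 2*pi)
            # Clamp guards against rounding that lands exactly on 2*pi.
            b = min(int(angle / (2 * math.pi) * n_bins), n_bins - 1)
            hist[b] += magnitude
    norm = math.sqrt(sum(h * h for h in hist))
    # L2-normalise; a flat patch has no gradients and stays all-zero.
    return [h / norm for h in hist] if norm > 0 else hist

def rgbd_descriptor(gray_patch, depth_patch, n_bins=8):
    """Concatenate the appearance and depth histograms (assumed fusion step)."""
    return (orientation_histogram(gray_patch, n_bins)
            + orientation_histogram(depth_patch, n_bins))

if __name__ == "__main__":
    # Synthetic 8x8 patches: intensity rises left to right, depth rises diagonally.
    gray = [[float(c) for c in range(8)] for r in range(8)]
    depth = [[float(c + 2 * r) for c in range(8)] for r in range(8)]
    print(rgbd_descriptor(gray, depth))
```

The first half of the returned vector describes appearance and the second half describes geometry, so a classifier trained on it can draw on both cues. Kernel-based and learned descriptors replace the hand-chosen binning used here with a kernel approximation or a learned dictionary.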
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
TELKOMNIKA Telecommunication, Computing, Electronics and Control
ISSN: 1693-6930, e-ISSN: 2302-9293