RGB-D Object Recognition: Features, Learning Algorithms and Multi-modal Fusion Strategies
With the rapid development of commodity depth cameras in recent years, RGB-D object recognition has become a very active research area in pattern recognition, computer vision and robotics. The RGB and depth images in RGB-D data are acquired independently and carry complementary visual information that reflects their different perceptual capabilities. This presents both an opportunity and a challenge for object recognition research.
In this half-day ICPR 2016 tutorial, we give a comprehensive overview of current developments in RGB-D object recognition. Because it is important to understand the relationship between conventional object recognition and RGB-D object recognition, we first analyze the basic characteristics of RGB-D data and the publicly released datasets. Second, we introduce the main feature representation methods for RGB-D data, organized by a proposed taxonomy that reveals the rationale behind those approaches. Here we focus on the redundancy and heterogeneity of the modal representations, which arise because the two modalities are acquired independently. Third, we introduce the existing learning algorithms for RGB-D object recognition, emphasizing how they exploit the complementarity of RGB-D data, especially when labeled data is limited. Fourth, we discuss multi-modal fusion strategies. We categorize current methods into three groups: feature-level, decision-level and hybrid fusion, a grouping that may help researchers better understand the state-of-the-art methods.
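To make the three fusion groups concrete, the following is a minimal sketch in plain Python, not a method taken from the tutorial. It assumes each modality has already been reduced to a feature vector and that classification is done by toy linear classifiers with a softmax. All function names, weights and the mixing parameter `alpha` are hypothetical and chosen only for illustration. In particular, the hybrid variant shown here (averaging the outputs of the other two) is just one possible reading of "hybrid fusion".

```python
import math


def softmax(scores):
    """Turn raw class scores into probabilities that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def linear_scores(weights, features):
    """Score each class with one weight row per class (toy linear classifier)."""
    return [sum(w * x for w, x in zip(row, features)) for row in weights]


def feature_level_fusion(rgb_feat, depth_feat, joint_weights):
    """Concatenate the modality features, then classify once."""
    fused = list(rgb_feat) + list(depth_feat)
    return softmax(linear_scores(joint_weights, fused))


def decision_level_fusion(rgb_feat, depth_feat, rgb_weights, depth_weights, alpha=0.5):
    """Classify each modality separately, then mix the class probabilities."""
    p_rgb = softmax(linear_scores(rgb_weights, rgb_feat))
    p_depth = softmax(linear_scores(depth_weights, depth_feat))
    return [alpha * a + (1.0 - alpha) * b for a, b in zip(p_rgb, p_depth)]


def hybrid_fusion(rgb_feat, depth_feat, joint_weights, rgb_weights, depth_weights, alpha=0.5):
    """Assumed hybrid scheme: average the feature-level and decision-level outputs."""
    p_feat = feature_level_fusion(rgb_feat, depth_feat, joint_weights)
    p_dec = decision_level_fusion(rgb_feat, depth_feat, rgb_weights, depth_weights, alpha)
    return [0.5 * a + 0.5 * b for a, b in zip(p_feat, p_dec)]


# Toy inputs: a 2-D RGB feature, a 3-D depth feature, and 2 object classes.
rgb_feat = [0.9, 0.1]
depth_feat = [0.2, 0.8, 0.5]
rgb_weights = [[1.0, -1.0], [-1.0, 1.0]]
depth_weights = [[0.5, -0.5, 0.0], [-0.5, 0.5, 0.2]]
joint_weights = [[1.0, -1.0, 0.5, -0.5, 0.0], [-1.0, 1.0, -0.5, 0.5, 0.2]]

p_feature = feature_level_fusion(rgb_feat, depth_feat, joint_weights)
p_decision = decision_level_fusion(rgb_feat, depth_feat, rgb_weights, depth_weights)
p_hybrid = hybrid_fusion(rgb_feat, depth_feat, joint_weights, rgb_weights, depth_weights)
```

The practical difference lies in where the modalities meet: feature-level fusion lets a single classifier see both feature vectors at once, whereas decision-level fusion keeps the classifiers independent and only combines their outputs.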
In summary, this tutorial covers RGB-D object recognition thoroughly, presenting basic concepts, state-of-the-art learning algorithms and future directions. We hope it will motivate researchers to use RGB-D data to solve object recognition problems in the near future.