Detection and Description of Articulated Objects Using RGB-D Sensor
Project Description
This project contributes to solving an ongoing problem in robotics: how to detect and predict motion constraints imposed by the environment. In this study, an object either translates along a straight-line path or rotates around a stationary axis. Detecting these kinematic structures, alongside the object geometries, is the purpose of this study.
The motion of objects was captured by a 3D camera, which produces a continuous RGB-D data stream as the objects move against a static background. These data are processed by an algorithm comprising the following steps: basic depth-image segmentation with RGB-D-based background removal, 3D data filtering, and the Iterative Closest Point (ICP) algorithm, which together compound the point clouds of objects in linear motion. Additionally, the rotary motion detection stage uses RANSAC-based plane detection to estimate the planar surface of a flat rotating object, such as a door. I developed the software in C++ with the OpenCV, Eigen, and PCL libraries and evaluated it on self-collected datasets of Kinect v2 RGB-D data streams.
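The sketch below illustrates one way these stages can be assembled with PCL: a pass-through and voxel-grid filter for the 3D data filtering step, ICP for frame-to-model registration during point cloud compounding, and RANSAC plane segmentation for the rotary case. The depth limits, leaf size, correspondence distance, and inlier threshold are illustrative assumptions, not the values used in the project.

```cpp
#include <pcl/point_types.h>
#include <pcl/point_cloud.h>
#include <pcl/filters/passthrough.h>
#include <pcl/filters/voxel_grid.h>
#include <pcl/registration/icp.h>
#include <pcl/segmentation/sac_segmentation.h>
#include <pcl/ModelCoefficients.h>
#include <pcl/PointIndices.h>
#include <Eigen/Core>

using CloudT = pcl::PointCloud<pcl::PointXYZ>;

// Crop and downsample a raw depth cloud before registration.
CloudT::Ptr preprocess(const CloudT::ConstPtr& raw)
{
  CloudT::Ptr cropped(new CloudT), filtered(new CloudT);

  // Keep points within an assumed working depth range (hypothetical limits).
  pcl::PassThrough<pcl::PointXYZ> pass;
  pass.setInputCloud(raw);
  pass.setFilterFieldName("z");
  pass.setFilterLimits(0.5f, 4.5f);
  pass.filter(*cropped);

  // Voxel-grid downsampling to reduce noise and point count.
  pcl::VoxelGrid<pcl::PointXYZ> voxel;
  voxel.setInputCloud(cropped);
  voxel.setLeafSize(0.01f, 0.01f, 0.01f);
  voxel.filter(*filtered);
  return filtered;
}

// Register a new frame against the compounded model with ICP and return the
// estimated rigid transform (frame -> model).
Eigen::Matrix4f registerFrame(const CloudT::ConstPtr& frame,
                              const CloudT::ConstPtr& model)
{
  pcl::IterativeClosestPoint<pcl::PointXYZ, pcl::PointXYZ> icp;
  icp.setInputSource(frame);
  icp.setInputTarget(model);
  icp.setMaxCorrespondenceDistance(0.05);  // assumed 5 cm correspondence gate
  icp.setMaximumIterations(50);

  CloudT aligned;
  icp.align(aligned);
  return icp.getFinalTransformation();
}

// Fit the dominant plane of a flat rotating object (e.g. a door) with RANSAC.
bool fitPlane(const CloudT::ConstPtr& object, pcl::ModelCoefficients& plane)
{
  pcl::PointIndices inliers;
  pcl::SACSegmentation<pcl::PointXYZ> seg;
  seg.setModelType(pcl::SACMODEL_PLANE);
  seg.setMethodType(pcl::SAC_RANSAC);
  seg.setOptimizeCoefficients(true);
  seg.setDistanceThreshold(0.01);          // assumed 1 cm inlier threshold
  seg.setInputCloud(object);
  seg.segment(inliers, plane);
  return !inliers.indices.empty();
}
```

The per-frame transforms returned by ICP can be accumulated to track the object's translation across the sequence, while the plane coefficients from successive frames describe the orientation of the rotating surface.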
Results
The motion of objects considered purely linear yielded an RMS error of 17 mm for object translation over a 4 m linear path. Furthermore, the detection of purely rotary motion (a rotating door) yielded an axis localization error below 8 mm and an axis orientation consistent within ±3 degrees; this dataset covered a door rotation of about 70 degrees. These errors served as deviation thresholds for classifying motion as purely linear or purely rotary, respectively. The developed algorithm was additionally tested on motion sequences with introduced disturbances: deviations from a straight-line path for the translating box and an additional rotation axis in the rotary dataset. The motion of objects subjected to these disturbances was correctly classified as neither linear nor revolute.
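As an illustration of how such a deviation threshold could be applied to the translating case, the hypothetical helper below fits a 3D line to the sequence of estimated object positions and accepts the motion as purely linear only if the RMS residual stays under a gate; the function name and the 17 mm gate are assumptions made for this sketch.

```cpp
#include <Eigen/Dense>
#include <vector>
#include <cmath>

// Classify a trajectory of estimated object positions as "purely linear" by
// fitting a 3D line (principal direction of the points) and comparing the
// RMS residual to a threshold.
bool isPureLinearMotion(const std::vector<Eigen::Vector3f>& positions,
                        float rmsThreshold = 0.017f)  // assumed 17 mm gate
{
  if (positions.size() < 3) return false;

  // Centroid of the trajectory.
  Eigen::Vector3f mean = Eigen::Vector3f::Zero();
  for (const auto& p : positions) mean += p;
  mean /= static_cast<float>(positions.size());

  // Principal direction from the covariance of the positions.
  Eigen::Matrix3f cov = Eigen::Matrix3f::Zero();
  for (const auto& p : positions) {
    const Eigen::Vector3f d = p - mean;
    cov += d * d.transpose();
  }
  Eigen::SelfAdjointEigenSolver<Eigen::Matrix3f> solver(cov);
  const Eigen::Vector3f dir = solver.eigenvectors().col(2);  // largest eigenvalue

  // RMS distance of the positions from the fitted line.
  float sumSq = 0.f;
  for (const auto& p : positions) {
    const Eigen::Vector3f d = p - mean;
    const Eigen::Vector3f residual = d - dir * dir.dot(d);
    sumSq += residual.squaredNorm();
  }
  const float rms = std::sqrt(sumSq / positions.size());
  return rms < rmsThreshold;
}
```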
Limitations
The methods used in this project have proven capable of detecting and classifying the linear and rotary motion of unknown objects under specific, favorable conditions. However, the project is limited by the sparsity of the data captured by the hardware used, and it relies on the assumption that both the camera and the scene background remain static. The detected objects featured large flat surfaces and sharp edges, which reduced registration drift when compounding the point clouds, and the objects of interest were made of materials easily detectable by a 3D camera over a large range of viewing angles. Further improvements to the data acquisition pipeline could include evaluating the accuracy of motion estimation with respect to the world frame; a motion capture system could also be used to evaluate the accuracy of object speed estimation.