Kiretu

Kiretu contains four classes for creating a point cloud: YMLParser, FrameGrabber, KinectCloud and CloudWriter.
This page describes the functionality of these classes and concentrates on their theoretical background. For programming aspects, see Data Structures.
The YMLParser imports the Kinect-specific calibration file, which includes the Kinect's extrinsics and the intrinsics of the depth and RGB camera. Afterwards, it parses all the parameters.
The FrameGrabber captures frames of the depth and RGB camera. One problem of the depth stream is image noise in the depth values. You can see this in the glview depth map:
I analyzed this issue by capturing the following test scene, which shows objects with different materials and degrees of reflection:
To get as many valid depth values as possible, the FrameGrabber is able to grab a variable number of frames (images). A depth value is called valid if . After grabbing all frames, it computes the mean of each pixel's valid depth values.
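The averaging step can be sketched as follows. The function name and the concrete validity test (non-zero and below 2047, the Kinect's "no data" code) are assumptions, since the original condition is not reproduced in the text:

```cpp
#include <cstdint>
#include <vector>

// Sketch of the per-pixel averaging the FrameGrabber performs:
// frames[k][i] is the raw depth of pixel i in frame k. A value is treated
// as valid here if it is non-zero and below 2047 -- an assumed condition.
std::vector<double> meanValidDepth(const std::vector<std::vector<uint16_t>>& frames,
                                   std::size_t numPixels) {
    std::vector<double> mean(numPixels, 0.0);
    std::vector<int> count(numPixels, 0);
    for (const auto& frame : frames)
        for (std::size_t i = 0; i < numPixels; ++i) {
            uint16_t d = frame[i];
            if (d > 0 && d < 2047) {  // assumed validity condition
                mean[i] += d;
                ++count[i];
            }
        }
    for (std::size_t i = 0; i < numPixels; ++i)
        mean[i] = count[i] ? mean[i] / count[i] : 0.0;  // 0 = still invalid
    return mean;
}
```

A pixel that is invalid in some frames still gets a mean from its remaining valid samples, which is why grabbing more frames recovers additional points.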
Here you can see the relation between the number of frames and the number of points (for the test scene):
I chose 50 frames as a compromise between the number of points and the capturing time.
A problem occurs if the depth values of a pixel vary too much over the frames, e.g. for pixels at an object's edge, which switch between the object's depth and the depth of the background behind it. You can see this effect in the following two images, both showing a point cloud of the test scene. For the first picture, only one frame was captured. The second image shows the problematic points after taking the mean of 50 frames.
The affected points are characterized by highly varying depth values. A measure of this diversity is the standard deviation [1]. For this reason, it is useful to take a look at the frequency of the standard deviations of the depth values around their corresponding means:
Now, the idea is to use the standard deviation as a threshold to recognize and discard problematic points. This implies that the chosen threshold determines the final number of points. Here you can see the total number of points depending on the standard deviation:
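The thresholding idea can be sketched like this (the function name and signature are assumptions; only the criterion itself follows from the text):

```cpp
#include <cmath>
#include <vector>

// Sketch: a pixel is kept only if the standard deviation of its depth
// samples across all frames stays below a given threshold.
bool depthIsStable(const std::vector<double>& samples, double maxStdDev) {
    if (samples.empty()) return false;
    double mean = 0.0;
    for (double s : samples) mean += s;
    mean /= samples.size();
    double var = 0.0;
    for (double s : samples) var += (s - mean) * (s - mean);
    var /= samples.size();  // population variance
    return std::sqrt(var) <= maxStdDev;
}
```

An edge pixel flickering between object and background produces a large standard deviation and is rejected, while a pixel with merely noisy but consistent depth passes.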
I chose as the default value.
Finally, the question is how many valid values can be achieved by this optimization. Here are the results (maximum ):
1 frame: 292,650 points
50 frames without threshold: 297,038 points (+4,388, +1.5 %)
50 frames with threshold: 295,461 points (+2,811, +1.0 %)
Based on the content of Reconstruction, this section explains how the reconstruction of the point cloud is done by KinectCloud.
Important: All references to equations refer to the numbers given in Reconstruction.
First, you have to convert the raw depth values of the Kinect ( ) into meters. This is done by the following experimentally determined formula [2]:
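As a sketch of such a conversion, here is one widely circulated experimental fit for the Kinect's raw 11-bit depth value. Whether these are exactly the constants of reference [2] is an assumption; treat them as illustrative:

```cpp
// Assumed experimental fit: converts the Kinect's raw 11-bit depth value
// into meters. The constants are one commonly cited calibration and may
// differ from the formula actually used in Kiretu.
double rawDepthToMeters(int raw) {
    return 1.0 / (static_cast<double>(raw) * -0.0030711016 + 3.3309495161);
}
```

Note that the mapping is non-linear: equal steps in the raw value correspond to increasingly large steps in metric depth at larger distances.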
Then, we project the depth image into three-dimensional space. Let (x, y) be a pixel of the captured depth image and z the depth value of the pixel, converted into meters. Using equation (5), the X and Y coordinates of the corresponding three-dimensional point can be computed as follows:

X = (x − c_x) · z / f_x
Y = (y − c_y) · z / f_y
Note that the intrinsic parameters (focal lengths and principal point) used here are those of the depth camera!
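The back-projection step can be sketched as follows; the struct and function names are assumptions, while the arithmetic is the standard inverted pinhole model described above:

```cpp
// Sketch of the back-projection: (x, y) is the depth-image pixel, z its
// depth in meters, and fx, fy, cx, cy the DEPTH camera's intrinsics.
struct Point3 { double X, Y, Z; };

Point3 backProject(double x, double y, double z,
                   double fx, double fy, double cx, double cy) {
    Point3 p;
    p.X = (x - cx) * z / fx;
    p.Y = (y - cy) * z / fy;
    p.Z = z;  // the depth itself is the Z coordinate
    return p;
}
```

A pixel at the principal point maps onto the optical axis (X = Y = 0), and points further from the image center spread out proportionally to their depth.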
The point cloud is now given in coordinates of the depth camera's coordinate system. We have to transform all points into the RGB camera's coordinate system. This is done by applying the extrinsic parameters (rotation R and translation T) to each point.
Finally, we reproject the point cloud onto the RGB image to get the corresponding color of each point. For this, we can use equation (5) again. Let (X, Y, Z) be a point in space. The corresponding pixel (x, y) of the RGB image can be computed as follows:

x = X · f_x / Z + c_x
y = Y · f_y / Z + c_y
Note that the intrinsic parameters (focal lengths and principal point) used here are those of the RGB camera!
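Steps three and four together can be sketched like this: transform a point into the RGB camera's frame with the extrinsics (R, T), then project it with the RGB intrinsics to find the pixel whose color the point receives. All names here are assumptions:

```cpp
#include <array>

// Hypothetical sketch of the extrinsic transform plus reprojection.
struct Pixel { double x, y; };

Pixel projectToRgb(const std::array<double, 3>& p,
                   const std::array<std::array<double, 3>, 3>& R,
                   const std::array<double, 3>& T,
                   double fx, double fy, double cx, double cy) {
    // p' = R * p + T : transform into the RGB camera's coordinate system
    std::array<double, 3> q;
    for (int i = 0; i < 3; ++i)
        q[i] = R[i][0] * p[0] + R[i][1] * p[1] + R[i][2] * p[2] + T[i];
    // pinhole projection with the RGB camera's intrinsic parameters
    return { q[0] * fx / q[2] + cx, q[1] * fy / q[2] + cy };
}
```

The resulting pixel coordinate is generally non-integer, so a real implementation still has to round or interpolate, and to check that the pixel lies inside the RGB image at all.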
For better understanding, here is a summary of the four reconstruction steps:
The CloudWriter generates and saves a PLY point cloud. Every point of the cloud is assigned the corresponding color value of the RGB image at its reprojected pixel coordinate.
Because of the different positions and orientations of the depth and RGB camera, points without a corresponding color may exist. They can either be discarded or printed in a user-defined color.
The reconstruction steps are recognizable in the filename, e.g.
cloud20120116172751MDEr.ply
where M, D, E and r relate to the reconstruction steps as described in the summary above. An upper-case letter indicates an executed reconstruction step, while a lower-case letter means the opposite.
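A sketch of how such a suffix could be built. Which letter stands for which of the four steps is my assumption (M for the meter conversion, D for the depth projection, E for the extrinsics, R for the reprojection), as is the helper's name:

```cpp
#include <string>

// Hypothetical helper: builds the four-letter step suffix of the output
// filename. Upper case = step executed, lower case = step skipped.
// The letter-to-step mapping is an assumption based on the step order.
std::string stepSuffix(bool meters, bool project3d, bool extrinsics, bool reproject) {
    std::string s;
    s += meters     ? 'M' : 'm';
    s += project3d  ? 'D' : 'd';
    s += extrinsics ? 'E' : 'e';
    s += reproject  ? 'R' : 'r';
    return s;
}
```

Under this mapping, the example filename's suffix MDEr would mean that the first three steps were executed and the reprojection was skipped.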