# Q：如何将RGB和深度从Kinect在MATLAB的图像

I am trying to align RGB and Depth image from Kinect using Matlab. I am trying to do it using the algorithm from this page.

Here is the code I have written so far

``````depth = imread('depth_00500.png');

rotationMat=[9.9984628826577793e-01 1.2635359098409581e-03 -1.7487233004436643e-02;
-1.4779096108364480e-03 9.9992385683542895e-01 -1.2251380107679535e-02;
1.7470421412464927e-02 1.2275341476520762e-02 9.9977202419716948e-01 ];

translationMat=[1.9985242312092553e-02, -7.4423738761617583e-04, -1.0916736334336222e-02 ];

%parameters for color matrix
fx_rgb= 5.2921508098293293e+02;
fy_rgb= 5.2556393630057437e+02;
cx_rgb= 3.2894272028759258e+02;
cy_rgb= 2.6748068171871557e+02;
k1_rgb= 2.6451622333009589e-01;
k2_rgb= -8.3990749424620825e-01;
p1_rgb= -1.9922302173693159e-03;
p2_rgb= 1.4371995932897616e-03;
k3_rgb= 9.1192465078713847e-01;

%parameters for depth matrix
fx_d= 5.9421434211923247e+02;
fy_d= 5.9104053696870778e+02;
cx_d= 3.3930780975300314e+02;
cy_d= 2.4273913761751615e+02;
k1_d= -2.6386489753128833e-01;
k2_d =9.9966832163729757e-01;
p1_d =-7.6275862143610667e-04;
p2_d =5.0350940090814270e-03;
k3_d =-1.3053628089976321e+00;

row_num=480;
col_num=640;

for row=1:row_num
for col=1:col_num

pixel3D(row,col,1) = (row - cx_d) * depth(row,col) / fx_d;
pixel3D(row,col,2) = (col - cy_d) * depth(row,col) / fy_d;
pixel3D(row,col,3) = depth(row,col);

end
end

pixel3D(:,:,1)=rotationMat*pixel3D(:,:,1)+translationMat;
pixel3D(:,:,2)=rotationMat*pixel3D(:,:,2)+translationMat;
pixel3D(:,:,3)=rotationMat*pixel3D(:,:,3)+translationMat;

P2Drgb_x = fx_rgb*pixel3D(:,:,1)/pixel3D(:,:,3)+cx_rgb;
P2Drgb_y = fy_rgb*pixel3D(:,:,2)/pixel3D(:,:,3)+cy_rgb;
``````

I am especially failing to understand why we're assigning value of depth pixel to dimension x,y and z of 3-dimensional space, shouldn't we assign (x,y,z) dimension to the depth pixel value?

I mean this part:

``````P3D.x = (x_d - cx_d) * depth(x_d,y_d) / fx_d
P3D.y = (y_d - cy_d) * depth(x_d,y_d) / fy_d
P3D.z = depth(x_d,y_d)
``````

Also I'm not sure If I can represent 3d space using matrix. I am trying to use it in my code but for sure it has wrong size as multiplication by 3x3 rotation matrix is impossible.

Thank you for very much for every suggestion and help!

``````depth = imread('depth_00500.png');

rotationMat=[9.9984628826577793e-01 1.2635359098409581e-03 -1.7487233004436643e-02;
-1.4779096108364480e-03 9.9992385683542895e-01 -1.2251380107679535e-02;
1.7470421412464927e-02 1.2275341476520762e-02 9.9977202419716948e-01 ];

translationMat=[1.9985242312092553e-02, -7.4423738761617583e-04, -1.0916736334336222e-02 ];

%parameters for color matrix
fx_rgb= 5.2921508098293293e+02;
fy_rgb= 5.2556393630057437e+02;
cx_rgb= 3.2894272028759258e+02;
cy_rgb= 2.6748068171871557e+02;
k1_rgb= 2.6451622333009589e-01;
k2_rgb= -8.3990749424620825e-01;
p1_rgb= -1.9922302173693159e-03;
p2_rgb= 1.4371995932897616e-03;
k3_rgb= 9.1192465078713847e-01;

%parameters for depth matrix
fx_d= 5.9421434211923247e+02;
fy_d= 5.9104053696870778e+02;
cx_d= 3.3930780975300314e+02;
cy_d= 2.4273913761751615e+02;
k1_d= -2.6386489753128833e-01;
k2_d =9.9966832163729757e-01;
p1_d =-7.6275862143610667e-04;
p2_d =5.0350940090814270e-03;
k3_d =-1.3053628089976321e+00;

row_num=480;
col_num=640;

for row=1:row_num
for col=1:col_num

pixel3D(row,col,1) = (row - cx_d) * depth(row,col) / fx_d;
pixel3D(row,col,2) = (col - cy_d) * depth(row,col) / fy_d;
pixel3D(row,col,3) = depth(row,col);

end
end

pixel3D(:,:,1)=rotationMat*pixel3D(:,:,1)+translationMat;
pixel3D(:,:,2)=rotationMat*pixel3D(:,:,2)+translationMat;
pixel3D(:,:,3)=rotationMat*pixel3D(:,:,3)+translationMat;

P2Drgb_x = fx_rgb*pixel3D(:,:,1)/pixel3D(:,:,3)+cx_rgb;
P2Drgb_y = fy_rgb*pixel3D(:,:,2)/pixel3D(:,:,3)+cy_rgb;
``````

``````P3D.x = (x_d - cx_d) * depth(x_d,y_d) / fx_d
P3D.y = (y_d - cy_d) * depth(x_d,y_d) / fy_d
P3D.z = depth(x_d,y_d)
``````

``````P3D.x = (x_d - cx_d) * depth(x_d,y_d) / fx_d
``````

In above line, depth(x_d, y_d) gives you the depth value at a pixel from the depth image. Then it is multiplied by the (x_d - cx_d), which is the difference along x-axis with the x coordinate of the centre point of depth map to the current pixel. Then finally this is divided by the fx_d, which is the focal length of the depth camera.

Following two references will help you to understand this mathematically well if you are interested in.

1. Mueller, K., Smolic, A., Dix, K., Merkle, P., Kauff, P., & Wiegand, T. (2008). View synthesis for advanced 3D video systems. EURASIP Journal on Image and Video Processing, 2008(1), 1-11.

2. Daribo, I., & Saito, H. (2011). A novel inpainting-based layered depth video for 3DTV. Broadcasting, IEEE Transactions on, 57(2), 533-541.

``````P3D.x = (x_d - cx_d) * depth(x_d,y_d) / fx_d
``````

1. 缪勒，K.，smolic，答，Dix，K.，Merkle，P.，kauff，P.，&；韦根，T（2008）。高级3D视频系统的视图合成。在图像和视频处理杂志，2008（1），1-11。

2. daribo，即&；西都，H（2011）。一种基于分层深度视频3DTV的修复。广播，IEEE，57（2），533-541。

matlab  kinect