There has been much confusion over how to specify frames for sensor_msgs/Image and sensor_msgs/CameraInfo messages and the correct use of parameters in CameraInfo for mono/stereo cameras. This proposal aims to establish a simple set of conventions and remove sources of confusion. It makes no changes to stable message formats.
Proposed conventions:
Given an Image, it is ONLY valid to go between image coordinates and 3D space using the corresponding CameraInfo. Hence:
frame_id in the Image header is NOT a real tf frame. Image.header.frame_id has no valid meaning in tf.
By convention, we set frame_id in the raw Image to the camera namespace, e.g. /narrow_stereo/left.
[Vijay] For consistency, maybe we should set the frame for raw images to /narrow_stereo/left_raw
For the rectified Image, we append "_rect", e.g. /narrow_stereo/left_rect.
CameraInfo.header.frame_id contains the tf frame which matrix P projects to image coordinates.
Users should always retrieve the tf frame from CameraInfo, never Image.
Both left and right CameraInfo have frame_id set to the left optical frame.
The right CameraInfo encodes the baseline in its P matrix.
- Matrix K is used only for rectification (with D and R). P is used for projection.
Stereo processing requires both Image messages, both CameraInfo messages, and no other information.
stereo_msgs/StereoInfo remains on the chopping block.
We do not need the 6-DOF transform between camera frames, only the 1-D baseline encoded in the right CameraInfo.
- We do not need the right camera frame at all. There will be no right optical tf frame in the robot model.
The 4x4 reprojection matrix (uvd1 -> xyzw) can be computed from both P matrices.