
Introduction to Sparse Bundle Adjustment

Description: This tutorial introduces Sparse Bundle Adjustment (SBA), its uses, and some basic concepts.

Keywords: sba, vision

Tutorial Level: BEGINNER

Next Tutorial: Performing SBA on Data from a File

What is SBA?

Sparse Bundle Adjustment, or SBA, reconstructs 3D structure and camera pose from a series of monocular or stereo images. It is usually applied to refine existing estimates of 3D geometry and camera pose rather than to compute them from scratch.

See the Wikipedia page on Bundle Adjustment and "Bundle Adjustment - A Modern Synthesis" by Triggs et al. for more information.

Why use it?

SBA is used in many applications, including improving visual odometry estimates for visual SLAM, reconstructing object geometry, and localizing tourist photos. In most applications involving uncertain camera poses and uncertain point positions, SBA can improve the estimates by minimizing a cost function, typically one based on reprojection error.

Some basic concepts

A keypoint is a feature in an image. In a stereo pair, a keypoint also has a match in the other image. In monocular images, a keypoint is defined by its image coordinates u, v. In stereo images, it is defined by u, v, d, where d is the disparity between the matching points in the two images.
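As a small illustration of the stereo case, the sketch below builds a (u, v, d) keypoint from a matched pair of features. It assumes rectified images, a keypoint stored relative to the left image, and disparity defined as d = u_left - u_right; the function name `stereo_keypoint` is hypothetical and not part of any SBA library.

```python
def stereo_keypoint(u_left, v_left, u_right):
    """Build a (u, v, d) stereo keypoint from a matched feature pair.

    Assumes rectified images, so the match lies on the same row and
    the disparity is the horizontal offset u_left - u_right.
    """
    d = u_left - u_right
    if d <= 0:
        # A non-positive disparity cannot come from a point in front of both cameras.
        raise ValueError("disparity must be positive for a valid stereo match")
    return (u_left, v_left, d)


# Illustrative values: feature at column 320 in the left image, 310 in the right.
kp = stereo_keypoint(320.0, 240.0, 310.0)  # (320.0, 240.0, 10.0)
```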

A point or world point is the representation of a keypoint in world coordinates, expressed as a homogeneous coordinate x, y, z, w (with w = 1 for an ordinary 3D point).

A node contains information about camera pose and camera parameters. A camera pose is expressed in terms of 6 variables: the translation x, y, z and the rotation as a unit quaternion restricted to one hemisphere (w > 0), so only its x, y, z components need to be stored.
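A minimal sketch of why three rotation variables suffice: with unit length and w > 0, the quaternion's w follows from its x, y, z. The helper names below (`quaternion_w`, `rotation_matrix`) and the pose tuple layout are assumptions made for illustration, not the API of the sba package.

```python
import math


def quaternion_w(qx, qy, qz):
    """Recover w of a unit quaternion, taking the non-negative root."""
    s = qx * qx + qy * qy + qz * qz
    if s > 1.0:
        raise ValueError("x, y, z are too large for a unit quaternion")
    return math.sqrt(1.0 - s)


def rotation_matrix(qx, qy, qz):
    """3x3 rotation matrix (nested lists, row-major) for the unit quaternion (qx, qy, qz, w)."""
    w = quaternion_w(qx, qy, qz)
    return [
        [1 - 2 * (qy * qy + qz * qz), 2 * (qx * qy - qz * w), 2 * (qx * qz + qy * w)],
        [2 * (qx * qy + qz * w), 1 - 2 * (qx * qx + qz * qz), 2 * (qy * qz - qx * w)],
        [2 * (qx * qz - qy * w), 2 * (qy * qz + qx * w), 1 - 2 * (qx * qx + qy * qy)],
    ]


# Illustrative 6-variable pose: translation (1, 2, 0.5) and a 90-degree rotation about z.
half_angle = math.radians(90.0) / 2.0
pose = (1.0, 2.0, 0.5, 0.0, 0.0, math.sin(half_angle))  # tx, ty, tz, qx, qy, qz
R = rotation_matrix(*pose[3:])
```

Storing only x, y, z keeps the parameterization minimal. It works because q and -q describe the same rotation, so restricting w to be positive loses no rotations.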

The projection of a point or world point is the image location at which it would appear, given a camera pose.

A projection is used to calculate the reprojection error: the difference between the observed location of a keypoint in an image and the location obtained by reprojecting its estimated world point into the image plane using the estimated camera pose.
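The two definitions above can be sketched for the monocular case as follows. This assumes a simple pinhole camera with intrinsics fx, fy, cx, cy and a world-to-camera transform p_cam = R p_world + t; both the camera model and the function names are assumptions for illustration, and a real SBA implementation may use different conventions.

```python
import math


def project(point_w, R_cw, t_cw, fx, fy, cx, cy):
    """Project a world point (x, y, z) into a pinhole camera; returns (u, v)."""
    # Transform the point from the world frame into the camera frame.
    pc = [sum(R_cw[i][j] * point_w[j] for j in range(3)) + t_cw[i] for i in range(3)]
    if pc[2] <= 0:
        raise ValueError("point is behind the camera")
    # Perspective division followed by the intrinsic parameters.
    return (fx * pc[0] / pc[2] + cx, fy * pc[1] / pc[2] + cy)


def reprojection_error(keypoint, projection):
    """Euclidean pixel distance between an observed keypoint and its projection."""
    return math.hypot(keypoint[0] - projection[0], keypoint[1] - projection[1])


IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

# Illustrative values: camera at the world origin, point 2 m in front of it.
proj = project((0.5, -0.25, 2.0), IDENTITY, (0.0, 0.0, 0.0),
               fx=500.0, fy=500.0, cx=320.0, cy=240.0)  # (445.0, 177.5)
err = reprojection_error((446.0, 177.0), proj)
```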

A track is an ordered collection of the projections of a particular world point into images from different camera poses.

Constraints define the relationships between cameras and points. For example, a projection is a point-to-camera constraint: it relates the camera pose to the point position. Likewise, odometry data between two camera poses is a camera-to-camera constraint, and a known distance between two points is a point-to-point constraint.


What do you need to run SBA?

Running SBA requires the following data:

  • Camera pose estimates for a series of camera poses.

  • Point position estimates in the world coordinate frame.

  • Projections of a given point into every frame in which it is visible. That is, the image coordinate location of each point in every image in which it is seen.
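One way to picture these three inputs is as a small container holding nodes, points, and projections. The sketch below is illustrative only: the class names (`Node`, `Projection`, `SbaProblem`), the assumed pinhole intrinsics, and all numeric values are made up here and do not reflect the sba package's actual data structures. It also shows how tracks can be derived by grouping projections by world point.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Node:
    translation: tuple  # (x, y, z) camera position estimate
    rotation: tuple     # (qx, qy, qz) of a unit quaternion with w > 0
    intrinsics: tuple   # (fx, fy, cx, cy), assumed pinhole parameters


@dataclass
class Projection:
    node_index: int   # which camera pose observed the point
    point_index: int  # which world point was observed
    keypoint: tuple   # (u, v) for monocular or (u, v, d) for stereo


@dataclass
class SbaProblem:
    nodes: list = field(default_factory=list)
    points: list = field(default_factory=list)       # (x, y, z) estimates in the world frame
    projections: list = field(default_factory=list)

    def add_projection(self, node_index, point_index, keypoint):
        """Record that a point was seen at `keypoint` in the image of a node."""
        if not 0 <= node_index < len(self.nodes):
            raise IndexError("unknown node index")
        if not 0 <= point_index < len(self.points):
            raise IndexError("unknown point index")
        self.projections.append(Projection(node_index, point_index, keypoint))

    def tracks(self):
        """Group projections by world point, ordered by node index."""
        grouped = defaultdict(list)
        for p in self.projections:
            grouped[p.point_index].append(p)
        return {k: sorted(v, key=lambda p: p.node_index) for k, v in grouped.items()}


# Made-up example: two camera poses, two points, three observations.
problem = SbaProblem()
problem.nodes.append(Node((0.0, 0.0, 0.0), (0.0, 0.0, 0.0), (500.0, 500.0, 320.0, 240.0)))
problem.nodes.append(Node((0.1, 0.0, 0.0), (0.0, 0.0, 0.0), (500.0, 500.0, 320.0, 240.0)))
problem.points.append((0.5, -0.25, 2.0))
problem.points.append((-0.3, 0.1, 3.0))
problem.add_projection(1, 0, (420.0, 177.5))
problem.add_projection(0, 0, (445.0, 177.5))
problem.add_projection(0, 1, (270.0, 256.7))
tracks = problem.tracks()
```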

Note that SBA does not compute camera poses and point positions from scratch: it requires initial estimates of both and refines them to better fit all the data. The appropriate method for generating these estimates varies by application, but for uses such as visual SLAM, visual odometry is the preferred method.
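A common way to make "better fit all the data" precise is to minimize the sum of squared reprojection errors over all recorded projections. The formula below is a sketch under that assumption. Its symbols are introduced here for illustration, and an actual SBA implementation may weight the terms or use a robust cost instead.

```latex
\min_{\{C_i\},\,\{P_j\}} \;
\sum_{(i,j) \in \mathcal{V}}
\left\lVert z_{ij} - \pi(C_i, P_j) \right\rVert^{2}
```

Here C_i is the pose of node i, P_j is world point j, z_ij is the keypoint observed for point j in the image of node i, π is the projection function, and V is the set of (node, point) pairs for which a projection exists.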

Now that you know the basics, continue on to the Performing SBA on Data from a File tutorial.

Wiki: sba/Tutorials/IntroductionToSBA (last edited 2010-07-06 22:58:09 by HelenOleynikova)