Robotics Fundamentals Series: Stereo Vision

Publish Date: Dec 10, 2008

Autonomous mobile robots use stereo vision to detect obstacles and measure how far away they are, which feeds into path planning. The method is also relatively inexpensive: while laser scanners can cost tens of thousands of dollars, stereo vision requires only two aligned cameras and some processing power.

Stereo vision is a technique that uses two cameras to measure the distance to objects in a scene, much as human depth perception relies on two eyes. The process uses two parallel cameras mounted a known distance apart. Each camera captures an image, and the two images are analyzed for common features. Triangulation is then applied to the relative positions of the matched pixels in the two images, as shown in Figure 1 below.

Figure 1: Stereo Vision Triangulation

Triangulation requires knowing the focal length of the cameras (f), the distance between the two cameras, known as the baseline (b), and the center of each image on the image plane (c1 and c2). Disparity (d) is the difference between v2 and v1, the lateral distances of the feature pixel from the respective image centers on the image plane. Using similar triangles, the distance from the cameras (D) is calculated as D = b * f / d. Because f and d must be expressed in the same units (for example, pixels), D comes out in the units of b. Nearby objects produce a large disparity and distant objects a small one.
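The formula above can be sketched in a few lines of Python. This is a minimal illustration, not code from the original application: the function name, the parameter names, and the sample numbers (a 0.12 m baseline and a 700-pixel focal length) are all assumptions chosen for the example. It also assumes v1 and v2 have already been measured relative to their own image centers.

```python
import math


def depth_from_disparity(baseline_m, focal_px, v1_px, v2_px):
    """Return the distance D = b * f / d for one matched feature.

    baseline_m: distance between the two cameras (b), in meters.
    focal_px:   focal length (f), in pixels.
    v1_px, v2_px: lateral offsets of the feature from each image center,
                  in pixels.
    """
    disparity_px = abs(v2_px - v1_px)
    if disparity_px == 0:
        # No measurable disparity: the point is treated as infinitely far away.
        return math.inf
    return baseline_m * focal_px / disparity_px


# Hypothetical values, for illustration only.
distance_m = depth_from_disparity(baseline_m=0.12, focal_px=700.0,
                                  v1_px=10.0, v2_px=45.0)
print(f"Estimated distance: {distance_m:.2f} m")
```

With these sample values the disparity is 35 pixels. Halving the disparity would double the estimated distance, which is why range accuracy degrades for far-away objects.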

The result for the computer vision system is a depth map, a grayscale image the same size as the original image in which each gray level represents a distance from the camera. For example, a black pixel marks a point at effectively infinite distance, and a white pixel marks a point very close to the camera. This processing can be done on a computer, but some cameras perform it on board using an FPGA.
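As a rough sketch of how distances could be turned into gray levels, the Python below maps each distance to an 8-bit value. It assumes the common convention that nearer points are brighter (white at the near limit, black at or beyond the far limit) and a simple linear scale between the two. The function names and the near and far limits are hypothetical, and real stereo cameras may use a different mapping.

```python
import math


def depth_to_gray(distance_m, near_m, far_m):
    """Map one distance to a gray level from 0 (far) to 255 (near)."""
    if math.isinf(distance_m) or distance_m >= far_m:
        return 0      # at or beyond the far limit: black
    if distance_m <= near_m:
        return 255    # at or inside the near limit: white
    # Linear scale between the limits (an assumption of this sketch).
    fraction_near = (far_m - distance_m) / (far_m - near_m)
    return round(255 * fraction_near)


def depth_map_to_image(depth_map, near_m, far_m):
    """Convert a 2D list of distances into a 2D list of gray levels."""
    return [[depth_to_gray(d, near_m, far_m) for d in row]
            for row in depth_map]


# A tiny 2 x 3 depth map in meters; math.inf marks unmatched or very far points.
depth_map = [[1.0, 5.5, math.inf],
             [10.0, 3.25, 1.0]]
gray = depth_map_to_image(depth_map, near_m=1.0, far_m=10.0)
for row in gray:
    print(row)
```

The output has the same dimensions as the input, matching the description of a depth map that is the same size as the original image.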

Custom stereoscopy algorithms can be implemented in LabVIEW using the NI Vision Development Module. With the two cameras, first use either blob detection or pattern matching to detect the object, then find its pixel coordinates in both images, and finally apply the algorithm that translates the difference in pixel position into distance and size. To learn more about image analysis, refer to Image Analysis and Processing.
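The three steps (detect, locate, convert) can be sketched in plain Python. This is not the NI Vision API or LabVIEW code: the threshold-based blob detector, the function names, and the toy images are assumptions made for illustration. The size estimate uses the pinhole relation width = pixel width * D / f. Both images are assumed to be the same size, so the image centers cancel when the two column positions are subtracted.

```python
def blob_column_extent(image, threshold=128):
    """Return (min_col, max_col) of pixels brighter than threshold, or None."""
    cols = [x for row in image for x, value in enumerate(row)
            if value > threshold]
    if not cols:
        return None
    return min(cols), max(cols)


def locate_object(left, right, focal_px, baseline_m):
    """Estimate (distance_m, width_m) of one bright object seen by both cameras."""
    left_extent = blob_column_extent(left)
    right_extent = blob_column_extent(right)
    if left_extent is None or right_extent is None:
        return None  # object not detected in one of the images

    # Step 2: pixel coordinates of the object (center column) in each image.
    left_center = (left_extent[0] + left_extent[1]) / 2
    right_center = (right_extent[0] + right_extent[1]) / 2

    # Step 3: convert the pixel discrepancy into distance, then into size.
    disparity_px = abs(left_center - right_center)
    if disparity_px == 0:
        return None  # too far away to triangulate
    distance_m = baseline_m * focal_px / disparity_px
    width_px = left_extent[1] - left_extent[0] + 1
    width_m = width_px * distance_m / focal_px
    return distance_m, width_m


# Toy 2 x 12 grayscale images: the same bright blob, shifted by 4 columns.
left_image = [[0, 0, 0, 0, 0, 0, 255, 255, 255, 0, 0, 0]] * 2
right_image = [[0, 0, 255, 255, 255, 0, 0, 0, 0, 0, 0, 0]] * 2

result = locate_object(left_image, right_image, focal_px=8.0, baseline_m=0.1)
print(result)
```

A production system would replace the thresholding step with a more robust detector, such as the blob detection or pattern matching mentioned above, and would first calibrate and rectify the cameras.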

Stereo vision was implemented in Virginia Tech's 2005 DARPA Grand Challenge entry, "Cliff", using LabVIEW and NI Vision. Virginia Tech used a Point Grey Bumblebee stereo vision camera, which is capable of processing image points at ranges of up to 30-40 m. The stereo camera was used in combination with a SICK LMS-291 scanning LIDAR to evaluate the vehicle's surroundings. Of 195 original teams, Virginia Tech fielded two entries in the top ten and was the highest-achieving independent university team.

To learn more about robotics, refer to the Robotics Fundamentals Series homepage.

See Also:

NI Vision Development Module
Virginia Tech Uses Virtual Instrumentation to Develop Autonomous Vehicles to Compete in the DARPA Grand Challenge

