Semi-Global Matching Algorithm
- Updated2025-11-25
- 3 minute(s) read
This algorithm aims to minimize the following global energy function, E, for disparity image, D.
with P2≥P1
- E(D) is the energy for disparity image D,
- p, q represent indices for pixels in the image,
- Np is the neighborhood of the pixel p,
- C(p, Dp) is the cost of pixel matching with disparity in Dp,
- P1 is the penalty passed by the user for change in disparity between neighboring pixels by 1,
- P2 is the penalty passed by the user for change in disparity between neighboring pixels by value greater than 1,
- I[.] is the function which returns 1 if argument is true, 0 otherwise the minimized function produces a perfect disparity map with smoothing governed by parameters P1 and P2; however, minimizing the function for a 2D image space is a NP-complete problem.
The semi-global matching function approximates the 2D minimization by performing multiple 1D, or linear, minimizations. The matching function aggregates costs on multiple paths which converge on the pixel under examination. Cost is computed for the disparity range specified by the minimum disparity and number of disparities parameters. By default, the matching algorithm aggregates costs for 5 directions. You can set the full dynamic programming parameter to true to force the algorithm to aggregate costs for 8 directions. Let, S(p, d) be the aggregate cost for pixel p and disparity d.
Then
where r is a direction used for converging to the pixel p Lr(p, d) is the minimum cost of the path taken in direction r from pixel (p for disparity d) The cost Lr(p, d) is given in the following equation:
The equation uses the following costs to find the disparity by adding current cost, C(p, d, to previous pixel in direction r:
- The minimum of the cost at previous pixel with disparity d,
- The cost at previous pixel with disparity d - 1 and d + 1 with added penalty P1,
- The cost at previous pixel with disparities less than d - 1 and greater than d + 1 with added penalty P2.
In order to limit the ever increasing value of Lr(p, d) on the path, minimum value of the previous pixel is subtracted. The upper value of Lr(p, d) is bounded by Cmax + P2, where Cmax is the maximum value of cost C. The cost function C(p, d) is computed in the following manner:
where IL and IR are left and right rectified images, respectively
The value of C is aggregated over a window of a user-defined size[1]1 Birchfield and C. Tomasi, Depth Discontinuities by Pixel-to-Pixel Stereo, IJCV, vol. 35(3), pp. 269-293, 1999.. After computing S(p, d) for each pixel p for each disparity d, the algorithm chooses the disparity which provides the minimum cost for that pixel.
The complexity of this algorithm is given in the following equation:
- w equals the width of the rectified image,
- h equals the height of the rectified image,
- n equals the number of disparities.
1 Birchfield and C. Tomasi, Depth Discontinuities by Pixel-to-Pixel Stereo, IJCV, vol. 35(3), pp. 269-293, 1999.