11.3 Dense correspondence
few others that have been developed specifically for stereo matching (Scharstein and Szeliski
2002; Hirschmüller and Scharstein 2009).
The most common pixel-based matching costs include sums of squared intensity differences
(SSD) (Hannah 1974) and absolute intensity differences (SAD) (Kanade 1994). In
the video processing community, these matching criteria are referred to as the mean-squared
error (MSE) and mean absolute difference (MAD) measures; the term displaced frame
difference is also often used (Tekalp 1995).
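
The relationship between these measures can be illustrated with a minimal sketch in plain Python. The function names and the 2x2 patch values below are made up for illustration and do not come from any particular library; MSE and MAD are simply SSD and SAD divided by the number of pixels.

```python
def ssd(patch_a, patch_b):
    """Sum of squared intensity differences over two equal-size patches."""
    return sum((a - b) ** 2
               for row_a, row_b in zip(patch_a, patch_b)
               for a, b in zip(row_a, row_b))


def sad(patch_a, patch_b):
    """Sum of absolute intensity differences over two equal-size patches."""
    return sum(abs(a - b)
               for row_a, row_b in zip(patch_a, patch_b)
               for a, b in zip(row_a, row_b))


def pixel_count(patch):
    """Number of pixels in a patch stored as a list of rows."""
    return sum(len(row) for row in patch)


# Hypothetical 2x2 patches (made-up intensities, for illustration only).
left_patch = [[10, 12], [14, 16]]
right_patch = [[11, 12], [10, 19]]

ssd_cost = ssd(left_patch, right_patch)
sad_cost = sad(left_patch, right_patch)
mse_cost = ssd_cost / pixel_count(left_patch)  # mean-squared error
mad_cost = sad_cost / pixel_count(left_patch)  # mean absolute difference
print(ssd_cost, sad_cost, mse_cost, mad_cost)
```

Because MSE and MAD differ from SSD and SAD only by a constant factor for a fixed window size, they rank candidate matches identically.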
More recently, robust measures (8.2), including truncated quadratics and contaminated
Gaussians, have been proposed (Black and Anandan 1996; Black and Rangarajan 1996;
Scharstein and Szeliski 1998). These measures are useful because they limit the influence
of mismatches during aggregation. Vaish, Szeliski, Zitnick et al. (2006) compare a number
of such robust measures, including a new one based on the entropy of the pixel values at a
particular disparity hypothesis (Zitnick, Kang, Uyttendaele et al. 2004), which is particularly
useful in multi-view stereo.
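
To see how a robust measure limits the influence of a mismatch, consider the following sketch. The truncated quadratic caps each squared residual at a threshold. The contaminated Gaussian is written here in one form that appears in the literature (the negative log of a Gaussian mixed with a constant outlier term); the exact parameterization varies between papers, so treat `sigma` and `epsilon` as illustrative assumptions. The residual values are invented.

```python
import math


def truncated_quadratic(residual, threshold):
    """Squared residual, capped at threshold**2."""
    return min(residual * residual, threshold * threshold)


def contaminated_gaussian(residual, sigma, epsilon):
    """Negative log of a Gaussian inlier term plus a constant outlier term.

    Assumed form: -log((1 - epsilon) * exp(-r^2 / (2 sigma^2)) + epsilon).
    """
    inlier = (1.0 - epsilon) * math.exp(-residual * residual / (2.0 * sigma * sigma))
    return -math.log(inlier + epsilon)


# Hypothetical intensity residuals inside one aggregation window;
# the last one stands in for a gross mismatch (e.g., an occluded pixel).
residuals = [1, -1, 2, 50]

quadratic_total = sum(r * r for r in residuals)
truncated_total = sum(truncated_quadratic(r, 5) for r in residuals)
print(quadratic_total, truncated_total)
```

With the plain quadratic, the single outlier dominates the aggregated cost; with the truncated quadratic, its contribution is bounded by the squared threshold.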
Other traditional matching costs include normalized cross-correlation (8.11) (Hannah
1974; Bolles, Baker, and Hannah 1993; Evangelidis and Psarakis 2008), which behaves
similarly to sum-of-squared-differences (SSD), and binary matching costs (i.e., match or no
match) (Marr and Poggio 1976), based on binary features such as edges (Baker and Binford
1981; Grimson 1985) or the sign of the Laplacian (Nishihara 1984). Because of their poor
discriminability, simple binary matching costs are no longer used in dense stereo matching.
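
As a rough illustration of normalized cross-correlation, the sketch below uses the common zero-mean variant, in which each patch has its mean subtracted before correlating. The function name, the flattened-list representation, and the choice to return 0.0 for a textureless patch are assumptions made for this example.

```python
import math


def ncc(patch_a, patch_b):
    """Zero-mean normalized cross-correlation of two flattened patches."""
    mean_a = sum(patch_a) / len(patch_a)
    mean_b = sum(patch_b) / len(patch_b)
    dev_a = [a - mean_a for a in patch_a]
    dev_b = [b - mean_b for b in patch_b]
    numerator = sum(x * y for x, y in zip(dev_a, dev_b))
    denominator = math.sqrt(sum(x * x for x in dev_a) * sum(y * y for y in dev_b))
    if denominator == 0.0:
        # Assumption: a constant (textureless) patch carries no evidence.
        return 0.0
    return numerator / denominator


# Hypothetical patch and a copy with a different gain and bias (2x + 10).
patch = [1, 2, 3, 4]
brighter = [2 * value + 10 for value in patch]
score = ncc(patch, brighter)
```

In this variant, a patch and a copy of it with a positive gain and an added bias correlate perfectly, which is one reason normalized correlation is attractive when exposure differs between images.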
Some costs are insensitive to differences in camera gain or bias, for example gradient-based
measures (Seitz 1989; Scharstein 1994), phase and filter-bank responses (Marr and
Poggio 1979; Kass 1988; Jenkin, Jepson, and Tsotsos 1991; Jones and Malik 1992), filters
that remove regular or robust (bilaterally filtered) means (Ansar, Castano, and Matthies 2004;
Hirschmüller and Scharstein 2009), dense feature descriptors (Tola, Lepetit, and Fua 2010),
and non-parametric measures such as rank and census transforms (Zabih and Woodfill 1994),
ordinal measures (Bhat and Nayar 1998), or entropy (Zitnick, Kang, Uyttendaele et al. 2004;
Zitnick and Kang 2007). The census transform, which converts each pixel inside a moving
window into a bit vector representing which neighbors are above or below the central pixel,
was found by Hirschmüller and Scharstein (2009) to be quite robust against large-scale,
non-stationary exposure and illumination changes.
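
The census transform described above can be sketched as follows. This is a minimal illustration rather than a reference implementation: the bit convention (1 when a neighbor is darker than the center), the border clamping, and the function names are all assumptions. Two census signatures are typically compared with the Hamming distance, i.e., the number of differing bits.

```python
def census_signature(image, row, col, radius=1):
    """Bit vector (packed into an int) for the window centered at (row, col).

    Assumed convention: a bit is 1 when the neighbor is darker than the center.
    Coordinates outside the image are clamped to the nearest valid pixel.
    """
    height, width = len(image), len(image[0])
    center = image[row][col]
    bits = 0
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            if dy == 0 and dx == 0:
                continue  # the center pixel is not compared with itself
            r = min(max(row + dy, 0), height - 1)
            c = min(max(col + dx, 0), width - 1)
            bits = (bits << 1) | (1 if image[r][c] < center else 0)
    return bits


def hamming_distance(signature_a, signature_b):
    """Number of bit positions in which two census signatures differ."""
    return bin(signature_a ^ signature_b).count("1")


# Hypothetical 3x3 image and a copy with a different gain and bias.
image = [[10, 20, 30], [40, 50, 60], [70, 80, 90]]
brighter = [[2 * value + 30 for value in row] for row in image]

cost = hamming_distance(census_signature(image, 1, 1),
                        census_signature(brighter, 1, 1))
```

Because only the ordering of intensities matters, any monotonically increasing change in exposure leaves the signature, and hence the matching cost, unchanged.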
It is also possible to correct for differing global camera characteristics by performing
a preprocessing or iterative refinement step that estimates inter-image bias–gain variations
using global regression (Gennert 1988), histogram equalization (Cox, Roy, and Hingorani
1995), or mutual information (Kim, Kolmogorov, and Zabih 2003; Hirschmüller 2008).
Local, smoothly varying compensation fields have also been proposed (Strecha, Tuytelaars, and
Van Gool 2003; Zhang, McMillan, and Yu 2006).
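
One simple way to picture global bias–gain estimation is an ordinary least-squares line fit between corresponding intensities. The sketch below is not the method of any of the cited papers; it assumes the two intensity lists are already in (approximate) correspondence, and the function name and sample values are invented for illustration.

```python
def fit_gain_bias(reference, target):
    """Least-squares fit of target ~= gain * reference + bias."""
    n = len(reference)
    mean_ref = sum(reference) / n
    mean_tgt = sum(target) / n
    s_xx = sum((x - mean_ref) ** 2 for x in reference)
    s_xy = sum((x - mean_ref) * (y - mean_tgt)
               for x, y in zip(reference, target))
    if s_xx == 0:
        raise ValueError("reference intensities are constant; gain is undefined")
    gain = s_xy / s_xx
    bias = mean_tgt - gain * mean_ref
    return gain, bias


# Hypothetical corresponding intensities; the second image is 1.5x + 5.
left_values = [10, 20, 30, 40]
right_values = [20, 35, 50, 65]

gain, bias = fit_gain_bias(left_values, right_values)
# Map the left intensities into the right image's radiometric range.
compensated = [gain * value + bias for value in left_values]
```

In an iterative refinement scheme, such a fit would be re-estimated as the correspondences improve, since the initial pairing of pixels is only approximate.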
In order to compensate for sampling issues, i.e., dramatically different pixel values in