Comparison between Large Displacement Optical Flow Algorithms
Background
Camera Basics
A digital camera captures color images by recording light in three channels of the visible spectrum (red, green, and blue). Most cameras do this with a single CCD or CMOS sensor element at each pixel. The sensor element measures the intensity of the light focused on it but cannot distinguish between colors. A color filter is therefore placed over each element, so that each pixel records the intensity of only one color. The arrangement of these filters across the sensor is known as a Color Filter Array (CFA). Figure 1 shows a very common CFA known as the Bayer Array.

In the Bayer Array, green filters are arranged in a checkerboard pattern, while red and blue filters fill the remaining positions on alternating rows. As a result, half of the pixels record green and a quarter each record red and blue. The original reasoning for this distribution was to mimic the physiology of the human eye. During daylight vision, luminance perception in the retina relies on both L and M cones, which are more sensitive to green light, so green is sampled most densely. The red and blue filters supply the chrominance information.
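The layout described above can be sketched in a few lines of NumPy. This is only an illustration: the function names `bayer_masks` and `mosaic` are hypothetical, and the phase of the pattern (red in the top-left corner, often called RGGB) is an assumption, since real sensors may start the pattern at a different pixel.

```python
import numpy as np

def bayer_masks(height, width):
    """Return boolean masks (red, green, blue) for a Bayer layout.

    Assumption: the pattern starts with red at pixel (0, 0) (RGGB phase).
    """
    rows, cols = np.indices((height, width))
    red = (rows % 2 == 0) & (cols % 2 == 0)    # even rows, even columns
    blue = (rows % 2 == 1) & (cols % 2 == 1)   # odd rows, odd columns
    green = ~(red | blue)                      # remaining checkerboard
    return red, green, blue

def mosaic(image):
    """Simulate the sensor: keep one color sample per pixel, zero the rest.

    `image` is an (height, width, 3) RGB array.
    """
    red, green, blue = bayer_masks(*image.shape[:2])
    masks = np.stack([red, green, blue], axis=-1)
    return np.where(masks, image, 0)
```

Applying `mosaic` to a full-color image yields the sparse channels that the interpolation schemes in the next section have to fill in.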
CFA Interpolation
Since each color channel is sampled only at specific coordinates, the remaining pixels in that channel must be estimated. There are many interpolation schemes; a few simple examples are the bilinear, bicubic, and smooth hue transition interpolations described below. Other schemes, such as median filter, gradient-based, and adaptive color plane interpolation, are increasingly complex. A wide survey of interpolation methods is available, along with an analysis of the advantages and disadvantages of each.
Bilinear/Bicubic
Each missing value is a linear combination of the known values within N pixels in every direction, with N = 1 (bilinear) or N = 3 (bicubic). For a given matrix of known values R(x,y), taken as zero where no sample exists, the interpolated values r(x,y) can be computed as:
$$ r(x,y) = \sum_{u=-N}^{N} \sum_{v=-N}^{N} h_r(u,v) \, R(x-u, y-v) $$
where h_r is the 2-D kernel containing the interpolation weights for the neighbors of a given pixel.
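As a minimal sketch of the bilinear case (N = 1), the kernel sum can be evaluated directly with NumPy. The 3x3 weights below are the standard bilinear kernel for the red or blue channel of a Bayer Array. The function name `interpolate` is hypothetical, and zero padding at the image border is an assumption, so values on the last row and column may be underestimated.

```python
import numpy as np

# Bilinear kernel (N = 1) for the red or blue channel of a Bayer Array.
H_RB = np.array([[1, 2, 1],
                 [2, 4, 2],
                 [1, 2, 1]]) / 4.0

def interpolate(sampled, kernel=H_RB):
    """Evaluate r(x, y) = sum over u, v of h(u, v) * R(x - u, y - v).

    `sampled` holds the known values and zeros everywhere else.
    Assumption: pixels outside the image are treated as zero.
    """
    n = kernel.shape[0] // 2
    height, width = sampled.shape
    padded = np.pad(sampled.astype(float), n)
    out = np.zeros((height, width))
    for u in range(-n, n + 1):
        for v in range(-n, n + 1):
            # The slice starting at (n - u, n - v) is R shifted by (u, v).
            shifted = padded[n - u:n - u + height, n - v:n - v + width]
            out += kernel[u + n, v + n] * shifted
    return out
```

At a pixel that already holds a sample, only the center weight contributes, so known values pass through unchanged. At a missing pixel, the weights of the two or four nearest samples sum to one.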
Smooth Hue Transition
Smooth hue transition interpolation assumes that the hue of a natural image (the ratio of chrominance to luminance, i.e. red/green and blue/green) changes smoothly within local regions of that image. The green channel is first interpolated using bilinear interpolation. A missing red value r(x,y) is then calculated from the known red values R(x,y) and the green values G(x,y):
$$ r(x,y) = G(x,y) \cdot \frac{1}{2} \left( \frac{R(x-1,y)}{G(x-1,y)} + \frac{R(x+1,y)}{G(x+1,y)} \right) $$
The sum shown above is valid when the pixel being interpolated has two adjacent red pixels in the same row. The equation is altered slightly when the two adjacent pixels are in the same column, or when there are four adjacent pixels at the corners. The missing blue pixels are calculated in the same way.
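One way to sketch this in NumPy is to interpolate the red/green ratio with a 3x3 bilinear kernel and then multiply by the green channel, which covers the row, column, and four-corner cases in a single pass. The names `spread` and `smooth_hue_red` are hypothetical. The sketch assumes the green channel has already been fully interpolated, and it treats pixels outside the image as zero, so the last row and column may be underestimated.

```python
import numpy as np

# Bilinear weights: 1/2 for row or column neighbors, 1/4 for corner neighbors.
KERNEL = np.array([[1, 2, 1],
                   [2, 4, 2],
                   [1, 2, 1]]) / 4.0

def spread(values, kernel=KERNEL):
    """Zero-padded weighted sum over each pixel's 3x3 neighborhood.

    The kernel is symmetric, so this equals a convolution with it.
    """
    height, width = values.shape
    padded = np.pad(values, 1)
    out = np.zeros((height, width))
    for du in range(3):
        for dv in range(3):
            out += kernel[du, dv] * padded[du:du + height, dv:dv + width]
    return out

def smooth_hue_red(red_sampled, red_mask, green):
    """Fill in red by interpolating the hue ratio R/G, then scaling by G.

    Assumption: `green` is already complete; ratios are only formed where
    a red sample exists and green is non-zero.
    """
    ratio = np.divide(red_sampled, green,
                      out=np.zeros(green.shape, dtype=float),
                      where=red_mask & (green != 0))
    return green * spread(ratio)
```

The blue channel can be handled by calling the same function with the blue samples and the blue mask.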
Detecting Forgeries
Regardless of the type of interpolation used in a given image, the estimated pixels should exhibit a strong correlation with, or dependence on, their surrounding pixels. If each pixel can be categorized as either correlated with or independent of its neighbors, the correlated pixels should form a periodic pattern that mimics the CFA used to construct the image. Even if an altered image shows no visual cues of forgery, inspecting these correlations can potentially expose which parts of the image were altered, because tampered regions tend to break the periodic pattern. It is important to note that an image can be altered while leaving these correlations intact. CFA interpolation analysis is therefore one tool among many for detecting forgeries.
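The idea can be illustrated with a deliberately simplified residual test on the green channel. This is a stand-in for illustration only, not the estimation procedure a full detector would use. It assumes the interpolation was bilinear, predicts each pixel from the average of its four neighbors, and compares the prediction error on the two checkerboard lattices. The function names and the score itself are hypothetical.

```python
import numpy as np

def green_residual(green):
    """Absolute error when each pixel is predicted from its four neighbors."""
    green = np.asarray(green, dtype=float)
    padded = np.pad(green, 1, mode="edge")
    predicted = (padded[:-2, 1:-1] + padded[2:, 1:-1]
                 + padded[1:-1, :-2] + padded[1:-1, 2:]) / 4.0
    return np.abs(green - predicted)

def lattice_contrast(green):
    """Ratio of the smaller to the larger mean residual of the two lattices.

    Near 0: one lattice is almost perfectly predicted, as expected when the
    CFA interpolation traces are intact. Near 1: both lattices behave alike,
    so no periodic trace is visible.
    """
    residual = green_residual(green)[1:-1, 1:-1]   # ignore the border
    rows, cols = np.indices(residual.shape)
    lattice_a = (rows + cols) % 2 == 0
    mean_a = residual[lattice_a].mean()
    mean_b = residual[~lattice_a].mean()
    low, high = sorted((mean_a, mean_b))
    return low / (high + 1e-12)
```

Computing `lattice_contrast` over small blocks of an image, rather than the whole image, would give a rough map in which blocks scoring close to 1 merit a closer look. A low score does not prove that a block is authentic, in line with the caveat above.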